Glutamine-rich maize seed protein and promoter

ABSTRACT

Provided is a nucleotide sequence encoding a 55 kDa maize prolamin family protein. Also provided is a nucleotide sequence derived from the promoter of the 55 kDa maize gene that can be used to express heterologous sequences in plants, and methods of using the disclosed nucleotide sequences.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional application of U.S. patent application Ser. No. 11/074,522, filed Mar. 8, 2005, the disclosure of which is incorporated herein by reference in its entirety. This application also claims the benefit of U.S. Provisional Patent Application Ser. No. 60/551,286, filed Mar. 8, 2004, the disclosure of which is also incorporated herein by reference in its entirety.

TECHNICAL FIELD

The presently disclosed subject matter relates generally to nucleic acids isolated from Zea mays. More particularly, the presently disclosed subject matter relates to nucleotide sequences encoding a member of the prolamin family of Zea mays cereal seed storage proteins, and to nucleotide sequences of a promoter of said prolamin family member. Also provided are methods of using the disclosed nucleic acid molecules in transgenic plants.

TABLE OF ABBREVIATIONS

-   2,4-D—2,4-dichlorophenoxyacetic acid
-   AMV—alfalfa mosaic virus
-   β-ME—β-mercaptoethanol
-   BiP—human immunoglobulin heavy-chain binding polypeptide
-   bp—basepair(s)
-   CAB—chlorophyll a/b binding
-   CaMV—cauliflower mosaic virus
-   CDPK—maize calcium-dependent protein kinase
-   cM—centimorgan(s)
-   CMV—cytomegalovirus
-   DEAE—diethyl aminoethyl
-   DHFR—dihydrofolate reductase
-   DTT—dithiothreitol
-   EDTA—ethylene diamine tetraacetic acid
-   ELISA—enzyme-linked immunosorbent assay
-   EMCV—encephalomyocarditis virus
-   EPSP—5-enol-pyruvyl shikimate-3-phosphate
-   ER—endoplasmic reticulum
-   EST—expressed sequence tag
-   GUS—β-glucuronidase
-   HEPES—N-(2-hydroxyethyl)piperazine-N′-(2-ethanesulfonic acid)
-   HSPs—high scoring sequence pairs
-   HSV-tk—Herpes Simplex Virus thymidine kinase
-   kb—kilobase(s)
-   k_(cat)—catalytic constant
-   kDa—kilodalton(s)
-   k_(m)—Michaelis constant
-   MCMV—maize chlorotic mottle virus
-   MDMV—maize dwarf mosaic virus
-   MTL—metallothionein-like
-   ORF—open reading frame
-   PAGE—polyacrylamide gel electrophoresis
-   PEG—polyethylene glycol
-   PEPC—phosphoenol carboxylase
-   pgk—phosphoglycerate kinase
-   Protox—protoporphyrinogen oxidase
-   psi—pounds per square inch
-   RFLPs—restriction fragment length polymorphisms
-   RUBISCO—ribulose-1,5-bisphosphate carboxylase/oxygenase
-   SDS—sodium dodecyl sulfate
-   SDS-PAGE—sodium dodecyl sulfate polyacrylamide gel electrophoresis
-   SSC—standard saline citrate
-   SSPE—standard saline-phosphate-EDTA
-   TEV—tobacco etch virus
-   T_(m)—thermal melting point
-   TMTD—tetramethylthiuram disulfide

-   TMV—tobacco mosaic virus

Amino Acid       3-Letter   1-Letter   Codons
Alanine          Ala        A          GCA; GCC; GCG; GCU
Arginine         Arg        R          AGA; AGG; CGA; CGC; CGG; CGU
Asparagine       Asn        N          AAC; AAU
Aspartic Acid    Asp        D          GAC; GAU
Cysteine         Cys        C          UGC; UGU
Glutamic Acid    Glu        E          GAA; GAG
Glutamine        Gln        Q          CAA; CAG
Glycine          Gly        G          GGA; GGC; GGG; GGU
Histidine        His        H          CAC; CAU
Isoleucine       Ile        I          AUA; AUC; AUU
Leucine          Leu        L          UUA; UUG; CUA; CUC; CUG; CUU
Lysine           Lys        K          AAA; AAG
Methionine       Met        M          AUG
Phenylalanine    Phe        F          UUC; UUU
Proline          Pro        P          CCA; CCC; CCG; CCU
Serine           Ser        S          AGC; AGU; UCA; UCC; UCG; UCU
Threonine        Thr        T          ACA; ACC; ACG; ACU
Tryptophan       Trp        W          UGG
Tyrosine         Tyr        Y          UAC; UAU
Valine           Val        V          GUA; GUC; GUG; GUU
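By way of illustration only, the codon assignments of the standard genetic code can be held in a lookup table and used to translate an RNA sequence. The following is a minimal sketch and not part of the disclosed methods; the names `CODONS_BY_AMINO_ACID`, `STOP_CODONS`, and `translate` are hypothetical, and the three stop codons are taken from the standard genetic code rather than from the table.

```python
# Codons of the standard genetic code grouped by one-letter amino acid code
# (RNA alphabet).
CODONS_BY_AMINO_ACID = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "N": ["AAC", "AAU"],
    "D": ["GAC", "GAU"],
    "C": ["UGC", "UGU"],
    "E": ["GAA", "GAG"],
    "Q": ["CAA", "CAG"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "K": ["AAA", "AAG"],
    "M": ["AUG"],
    "F": ["UUC", "UUU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
}

# Assumption: the stop codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Invert the table so each codon maps to its amino acid.
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons
}


def translate(sequence: str) -> str:
    """Translate an RNA (or DNA coding-strand) sequence, stopping at the first stop codon."""
    rna = sequence.upper().replace("T", "U")
    residues = []
    for start in range(0, len(rna) - 2, 3):
        codon = rna[start:start + 3]
        if codon in STOP_CODONS:
            break
        residues.append(CODON_TO_AMINO_ACID[codon])
    return "".join(residues)
```

For example, under these assumptions `translate("AUGCAACAGUAA")` yields a methionine followed by two glutamine residues.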

BACKGROUND

An objective of crop trait functional genomics is to identify crop trait genes of interest, for example, genes capable of conferring useful agronomic traits in crop plants. Such agronomic traits include, but are not limited to, enhanced yield, whether in quantity or quality; enhanced nutrient acquisition and metabolic efficiency; enhanced or altered nutrient composition of plant tissues used for food, feed, fiber, or processing; enhanced utility for agricultural or industrial processing; enhanced resistance to plant diseases; enhanced tolerance of adverse environmental conditions including, but not limited to, drought, excessive cold, excessive heat, or excessive soil salinity or extreme acidity or alkalinity; and alterations in plant architecture or development, including changes in developmental timing. The deployment of such identified trait genes by either transgenic or non-transgenic approaches can materially improve crop plants for the benefit of agriculture.

Cereals are the most important crop plants on the planet in terms of both human and animal consumption. Genomic synteny (conservation of gene order within large chromosomal segments) is observed in rice, maize, wheat, barley, rye, oats, and other agriculturally important monocots including sorghum (see e.g., Kellogg, 1998; Song et al., 2001, and references therein), which facilitates the mapping and isolation of orthologous genes from diverse cereal species based on the sequence of a single cereal gene. Rice has the smallest (about 420 Mb) genome among the cereal grains, and has recently been a major focus of public and private genomic and EST sequencing efforts. See Goff et al., 2002.

The identification of genes that are important for crop development is an ongoing effort in the agricultural community. Additional information can also be derived from the analysis of the genomes of various important plants. For example, the identification of regulatory elements that control the expression of genes can also lead to the ability to manipulate the plant genome to express polypeptides of interest in specific tissues. In particular, certain plants are becoming the organisms of choice for large-scale production of commercially important proteins such as enzymes. This strategy takes advantage of the fact that during seed development, endosperm cells synthesize large amounts of storage proteins of the zein family, which are deposited in structures known as protein bodies derived from the endoplasmic reticulum. These protein bodies cofractionate with the gluten fraction produced in corn wet milling. The potential therefore exists to generate large quantities of recombinant enzymes in a form associated with gluten or in a more pure form following release of the recombinant enzyme activity from the gluten-associated or immobilized state.

What are needed, then, are new methods and reagents for expressing heterologous nucleotide sequences in plant cells. To meet these needs, the presently disclosed subject matter provides in some embodiments a promoter sequence for directing expression of heterologous nucleotide sequences in plant cells. Also provided are methods for expressing heterologous nucleotide sequences in plant cells using the disclosed promoter.

The presently disclosed subject matter addresses these problems associated with the expression of nucleotide sequences in transgenic plants, as well as other problems.

SUMMARY

This Summary lists several embodiments of the presently disclosed subject matter, and in many cases lists variations and permutations of these embodiments. This Summary is merely exemplary of the numerous and varied embodiments. Mention of one or more representative features of a given embodiment is likewise exemplary. Such an embodiment can typically exist with or without the feature(s) mentioned; likewise, those features can be applied to other embodiments of the presently disclosed subject matter, whether listed in this Summary or not. To avoid excessive repetition, this Summary does not list or suggest all possible combinations of such features.

The presently disclosed subject matter provides methods for expressing a nucleotide sequence in a plant. In some embodiments, the method comprises (a) operably linking the nucleotide sequence to a promoter comprising SEQ ID NO: 6 to produce an expression cassette; and (b) generating a transgenic plant comprising the expression cassette, whereby the nucleotide sequence is expressed in the plant. In some embodiments, the generating comprises transforming a plant cell with the expression cassette and regenerating the plant from the transformed plant cell. In some embodiments, the generating comprises homologously recombining the nucleotide sequence into an endogenous genetic locus under the control of a promoter comprising SEQ ID NO: 6. In some embodiments, the transforming is by biolistic transformation of a vector comprising the expression cassette. In some embodiments, the vector is a binary Agrobacterium expression vector.

In some embodiments, the nucleotide sequence encodes a polypeptide selected from the group consisting of carbohydrases, cellulases, hemicellulases, pectinases, isomerases, lyases, proteases, heat shock proteins, chaperonins, phytases, insecticidal proteins, antimicrobial proteins, α-amylases, glucoamylases, glucanases, glucosidases, xylanases, ferulic acid esterases, galactosidases, pectinases, and chymosin.

The presently disclosed subject matter also provides methods for expressing a nucleotide sequence in a plant. In some embodiments, the method comprises (a) operably linking the nucleotide sequence to a plant promoter, the nucleotide sequence comprising SEQ ID NO: 1 to produce an expression cassette; and (b) generating a transgenic plant comprising the expression cassette, whereby the nucleotide sequence is expressed in the plant. In some embodiments, the generating comprises transforming a plant cell with the expression cassette and regenerating the plant from the transformed plant cell. In some embodiments, the transforming is by biolistic transformation of a vector comprising the expression cassette. In some embodiments, the vector is a binary Agrobacterium expression vector.

The presently disclosed subject matter also provides methods for producing a heterologous polypeptide in a plant cell. In some embodiments, the method comprises (a) generating a plant cell comprising a nucleotide sequence encoding the heterologous polypeptide operably linked to SEQ ID NO: 6; and (b) expressing in the plant cell the nucleotide sequence encoding the heterologous polypeptide, whereby the heterologous polypeptide is produced in the plant cell. In some embodiments, the generating comprises transforming the plant cell with an expression cassette comprising the nucleotide sequence encoding the heterologous polypeptide. In some embodiments, the generating comprises homologously recombining the nucleotide sequence into an endogenous genetic locus under the control of a promoter comprising SEQ ID NO: 6 such that the nucleotide sequence becomes operably linked to SEQ ID NO: 6. In some embodiments, the transforming is by biolistic transformation of a vector comprising the expression cassette. In some embodiments, the vector is a binary Agrobacterium expression vector. In some embodiments, the instant method further comprises regenerating a plant from the plant cell. In some embodiments, the instant method further comprises isolating the polypeptide from the plant. In some embodiments, the polypeptide is located within a protein body of endoplasmic reticulum of the plant cell.

The methods and compositions of the presently disclosed subject matter can be used to produce a heterologous polypeptide of interest in a plant cell. In some embodiments, the nucleotide sequence encodes a polypeptide selected from the group consisting of carbohydrases, cellulases, hemicellulases, pectinases, isomerases, lyases, proteases, heat shock proteins, chaperonins, phytases, insecticidal proteins, antimicrobial proteins, α-amylases, glucoamylases, glucanases, glucosidases, xylanases, ferulic acid esterases, galactosidases, pectinases, and chymosin.

The presently disclosed subject matter also provides methods for targeting a protein of interest to a structure of a plant cell selected from the group consisting of endoplasmic reticulum (ER) and apoplast. In some embodiments, the method comprises (a) fusing a nucleic acid molecule encoding a signal sequence of a Zea mays Q protein in frame to a nucleotide sequence encoding the protein of interest, wherein the nucleic acid molecule encoding a signal sequence of a Zea mays Q protein and the nucleotide sequence encoding the protein of interest are operably linked to a promoter to produce a plant expression construct; and (b) transforming the plant cell with the plant expression construct, whereby the protein of interest is targeted to the structure.

The presently disclosed subject matter also provides methods for producing a plant seed with an increased nutritional value. In some embodiments, the method comprises (a) transforming a plant cell with an expression vector comprising a nucleotide sequence encoding SEQ ID NO: 2, or a fragment or derivative thereof; (b) regenerating a plant from the transformed plant cell; and (c) isolating a seed from the regenerated plant, whereby a seed with an increased nutritional value is produced. In some embodiments, the increased nutritional value is selected from the group consisting of an increased level of an essential amino acid, an improved amino acid balance, and an improved amino acid digestibility, when compared to a seed from a non-transformed plant of the same species.

The presently disclosed subject matter also provides methods for targeting a protein of interest to a protein body in a plant. In some embodiments, the method comprises (a) fusing a nucleic acid molecule encoding SEQ ID NO: 2, or a fragment or derivative thereof, in frame to a nucleotide sequence encoding the protein of interest, wherein the nucleic acid molecule encoding SEQ ID NO: 2, or the fragment or derivative thereof, and the nucleotide sequence encoding the protein of interest are operably linked to a promoter to produce a plant expression construct; and (b) transforming the plant cell with the plant expression construct, whereby the protein of interest is targeted to a protein body in the plant.

The presently disclosed subject matter also provides isolated nucleic acid molecules, expression cassettes, recombinant vectors, cells, and transgenic plants comprising the disclosed nucleic acid molecules and expression cassettes. In some embodiments, the nucleic acid molecules and expression cassettes comprise SEQ ID NO: 1, and in some embodiments the nucleic acid molecules and expression cassettes comprise SEQ ID NO: 6. In some embodiments, the expression cassette is expressed in seed and a polypeptide encoded by the expression cassette is located within a protein body of endoplasmic reticulum of a cell of the seed.

The methods and compositions of the presently disclosed subject matter can be used with any plant species. In some embodiments, the plant is a monocot. In some embodiments, the monocot is selected from the group consisting of rice, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, grama grass, Tripsacum, and teosinte. In some embodiments, the plant is selected from the group consisting of rice, wheat, barley, rye, maize, potato, canola, soybean, sunflower, carrot, sweet potato, sugarbeet, bean, pea, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, eggplant, pepper, celery, squash, pumpkin, cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tobacco, tomato, sorghum and sugarcane. In some embodiments, the plant is maize.

In some embodiments of the presently disclosed subject matter, the expression cassette is expressed in a tissue selected from the group consisting of the epidermis, root, vascular tissue, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof. In some embodiments, the expression cassette is expressed in a seed and a polypeptide encoded by the nucleotide sequence is located within a protein body of endoplasmic reticulum of a cell of the seed.

Accordingly, it is an object of the presently disclosed subject matter to provide reagents and methods for expressing heterologous sequences in plant cells. This and other objects are achieved in whole or in part by the presently disclosed subject matter.

An object of the presently disclosed subject matter having been stated hereinabove, and which is addressed in whole or in part by the presently disclosed subject matter, other objects will become evident as the description proceeds when taken in connection with the accompanying drawings as best described hereinbelow.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts Western blot analysis of flour extracts using antisera to purified Nov9x phytase. The first lane on the blot depicts the positions of molecular weight standards (labeled on left). A188 is an extract of seed from non-transgenic corn. Nov9x phytase (predicted molecular weight 47 kDa) is identified as the prominent band migrating in the region of the 55 kDa standard. Nov9x phytase was not detected in the negative control (i.e., A188) corn flour extract.

FIG. 2 depicts SDS-PAGE analysis of proteins extracted from endosperm flour (endo.) or from embryo paste (embryo) from non-transgenic corn. Samples were run before (−) and after (+) heating the extract to 80° C. for 20 minutes.

FIG. 3 is a bar graph depicting the results of assaying α-galactosidase activity in flour from transgenic corn seeds that contain a fusion protein including the Q protein amino acid sequence fused to α-galactosidase. Activity is expressed as α-galactosidase units per gram of flour from individual seeds.

FIG. 4 is a bar graph depicting the results of assaying α-galactosidase activity in flour from transgenic corn seeds that contain a fusion protein including the Q protein signal sequence fused to α-galactosidase. Activity is expressed as α-galactosidase units per gram of flour from individual seeds.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING

SEQ ID NO: 1 is a nucleotide sequence of the open reading frame of the Q protein gene.

SEQ ID NO: 2 is an amino acid sequence encoded by SEQ ID NO: 1.

SEQ ID NO: 3 is a nucleotide sequence from the Q protein cDNA that was employed in genomic walking experiments to identify the promoter of the Q protein gene (SEQ ID NO: 6).

SEQ ID NOs: 4 and 5 are the nucleotide sequences of two Q-protein-specific primers that were employed in the genomic walking experiments.

SEQ ID NO: 6 is the nucleotide sequence of a promoter from the Q protein gene that is capable of directing expression of operably linked nucleotide sequences.

SEQ ID NO: 7 is a partial N-terminal amino acid sequence of an abundant 27 kDa protein isolated from maize endosperm, which matches the amino-terminus of the 28 kDa maize glutelin-2 (γ-zein; GENBANK® Accession No. P04706).

SEQ ID NO: 8 is a partial N-terminal amino acid sequence of the abundant 55 kDa protein isolated from maize endosperm.

SEQ ID NO: 9 is the amino acid sequence of the V5 epitope tag derived from the P and V proteins of the paramyxovirus of simian virus 5 (SV5).

SEQ ID NO: 10 is the amino acid sequence of a pentapeptide epitope tag.

SEQ ID NO: 11 is a C-terminal hexapeptide sequence present on recombinant Nov9x phytase.

SEQ ID NO: 12 is the amino acid sequence of the gene product of the phytase expression cassette.

SEQ ID NO: 13 is a nucleotide sequence of a Zea mays γ-zein promoter to which a 5′ Hind III and a 3′ BamH I recognition sequence have been added.

SEQ ID NO: 14 is a nucleotide sequence of pNOV4061, an expression construct encoding a Nov9x phytase with a gamma zein signal sequence under the control of the gamma zein promoter.

SEQ ID NO: 15 is a nucleotide sequence of pNOV2117, an Agrobacterium binary vector encoding an E. coli manA phosphomannose isomerase polypeptide under the transcriptional control of a maize ubiquitin promoter.

SEQ ID NO: 16 is a nucleotide sequence of pNOV4325, an intermediate plasmid encoding a γ-zein-galA fusion protein separated by a 9 nucleotide linker.

SEQ ID NO: 17 is a nucleotide sequence of pNOV4349, an Agrobacterium binary vector based on pNOV2117 into which a Q-protein coding sequence has been inserted.

SEQ ID NO: 18 is a nucleotide sequence of pNOV4328, an intermediate plasmid encoding a γ-zein signal sequence/galA fusion protein under the transcriptional control of a γ-zein promoter.

DETAILED DESCRIPTION

The presently disclosed subject matter will now be described more fully hereinafter with reference to the accompanying Examples, in which representative embodiments of the presently disclosed subject matter are shown. The presently disclosed subject matter can, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the presently disclosed subject matter to those skilled in the art.

All publications, patent applications, patents, and other references cited herein are incorporated by reference in their entireties.

I. General Considerations

A 55 kDa maize protein (designated Q protein) was identified, and appears to belong to the prolamin family of cereal seed storage proteins. The protein is released from endosperm flour in the presence of dithiothreitol (DTT) and is the second most abundant protein in these extracts after γ-zein, which by itself constitutes about 15% of endosperm protein. An expressed sequence tag (EST) sequence encoding a portion of the protein was retrieved from the GENBANK® database. Using oligonucleotide primers derived from the EST sequence, two overlapping cDNA clones were amplified. The combined clones encode a polypeptide chain of 308 residues including a predicted signal peptide of 19 residues. The deduced sequence of the mature protein is 289 amino acids and includes 104 Gln and 13 Glu. The combined percentage of Gln and Glu residues is 40%, more than twice that reported for other zeins. The predicted sequence also includes 5 Lys, an amino acid that is completely lacking in all other zeins except δ-zein, which has just one. The γ-zein protein accumulates on the periphery of protein bodies, and the co-fractionation of the 55 kDa Q protein and γ-zein suggests that both proteins accumulate in this region.
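The residue counts stated above can be checked with simple arithmetic. The following sketch uses only the figures given in the preceding paragraph; the variable and function names are hypothetical and serve illustration only.

```python
# Figures taken from the description of the Q protein above.
PRECURSOR_LENGTH = 308        # residues encoded by the combined cDNA clones
SIGNAL_PEPTIDE_LENGTH = 19    # predicted signal peptide
GLUTAMINE_COUNT = 104
GLUTAMATE_COUNT = 13
LYSINE_COUNT = 5


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# The mature protein is the precursor minus the signal peptide.
mature_length = PRECURSOR_LENGTH - SIGNAL_PEPTIDE_LENGTH

# Combined glutamine and glutamate content of the mature protein.
gln_glu_percent = percent(GLUTAMINE_COUNT + GLUTAMATE_COUNT, mature_length)

# Lysine content of the mature protein.
lys_percent = percent(LYSINE_COUNT, mature_length)

print(f"mature length: {mature_length} residues")
print(f"Gln + Glu: {gln_glu_percent:.1f}%")
print(f"Lys: {lys_percent:.1f}%")
```

The mature length evaluates to 289 residues, and 117 of 289 residues is about 40.5%, consistent with the stated value of 40%.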

Previous efforts aimed at increasing Lys content of zeins for improved nutrition involved the insertion of 1-2 Lys residues into α-zein (Wallace et al., 1988) and up to 10 Lys residues into γ-zein (Torrent et al., 1987). The modified zeins accumulated in structures resembling protein bodies when synthesized in transient expression systems using either Xenopus oocytes or maize endosperm. However, both α-zein and γ-zein lack Lys and contain only 1-4 acidic amino acids that might otherwise serve to neutralize positively charged residues. By contrast, the deduced sequence of the 55 kDa Q protein contains a total of 16 acidic residues. The presence of 5 Lys in the 55 kDa protein and its apparent localization at or near the surface of protein bodies suggests that its native conformation can tolerate substitution of additional Lys residues.

II. Definitions

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the presently disclosed subject matter pertains. For clarity of the present specification, certain definitions are presented hereinbelow.

Following long-standing patent law convention, the terms “a”, “an”, and “the” refer to “one or more” when used in this application, including in the claims. Thus, the phrases “a cell” and “the cell” refer to one or more cells, unless the context is clearly to the contrary.

As used herein, the terms “associated with” and “operably linked” refer to two nucleotide sequences that are related physically or functionally. For example, a promoter or regulatory DNA sequence is said to be “associated with” a DNA sequence that encodes an RNA or a polypeptide if the two sequences are operably linked, or situated such that the regulatory DNA sequence will affect the expression level of the coding or structural DNA sequence.

As used herein, the term “chimera” refers to a polypeptide that comprises domains or other features that are derived from different polypeptides or are in a position relative to each other that is not naturally occurring.

As used herein, the term “chimeric construct” refers to a recombinant nucleic acid molecule in which a promoter or regulatory nucleotide sequence is operably linked to, or associated with, a nucleotide sequence that codes for an mRNA or which is expressed as a polypeptide, such that the regulatory nucleotide sequence is able to regulate transcription or expression of the associated nucleotide sequence. The regulatory nucleotide sequence of the chimeric construct is not normally operably linked to the associated nucleotide sequence as found in nature.

As used herein, the terms “coding sequence” and “open reading frame” (ORF) are used interchangeably and refer to a nucleotide sequence that is transcribed into RNA such as mRNA, rRNA, tRNA, snRNA, sense RNA, or antisense RNA. In some embodiments, the RNA is then translated in vivo or in vitro to produce a polypeptide. In some embodiments, an ORF of a maize Q protein comprises SEQ ID NO: 1.
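For the protein-coding case only, an open reading frame can be recognized by a few structural properties. The following sketch is illustrative and narrower than the definition above, which also covers sequences transcribed into non-translated RNA. It assumes the standard genetic code, a DNA coding-strand input, and the common convention of an ATG start codon; the function name `is_protein_coding_orf` is hypothetical.

```python
# Assumption: stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS_DNA = {"TAA", "TAG", "TGA"}


def is_protein_coding_orf(dna: str) -> bool:
    """Return True if dna starts with ATG, ends with a stop codon,
    has a length divisible by three, and has no internal in-frame stop codon."""
    sequence = dna.upper()
    # Need at least a start codon and a stop codon.
    if len(sequence) < 6 or len(sequence) % 3 != 0:
        return False
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS_DNA:
        return False
    # No stop codon may occur before the final codon.
    return all(codon not in STOP_CODONS_DNA for codon in codons[:-1])
```

Under these assumptions, `is_protein_coding_orf("ATGCAACAGTAA")` is true, whereas a sequence with a premature in-frame stop codon or a length that is not a multiple of three is rejected.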

As used herein, the term “complementary” refers to two nucleotide sequences that comprise antiparallel nucleotide sequences capable of pairing with one another upon formation of hydrogen bonds between the complementary base residues in the antiparallel nucleotide sequences. As is known in the art, the nucleotide sequences of two complementary strands are the reverse complement of each other when each is viewed in the 5′ to 3′ direction.

As is also known in the art, two sequences that hybridize to each other under a given set of conditions do not necessarily have to be 100% fully complementary. As used herein, the terms “fully complementary” and “100% complementary” refer to sequences for which the complementary regions are 100% in Watson-Crick base-pairing: i.e., that no mismatches occur within the complementary regions. However, as is often the case with recombinant molecules (for example, cDNAs) that are cloned into cloning vectors, certain of these molecules can have non-complementary overhangs on either the 5′ or 3′ ends that result from the cloning event. In such a situation, it is understood that the region of 100% or full complementarity excludes any sequences that are added to the recombinant molecule (typically at the ends) solely as a result of, or to facilitate, the cloning event. Such sequences are, for example, polylinker sequences, linkers with restriction enzyme recognition sites, etc.
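The two definitions above can be illustrated with a short sketch. It assumes DNA sequences written 5′ to 3′ using only the letters A, C, G, and T, and it compares equal-length regions only, so any cloning-derived overhangs would need to be trimmed beforehand. The function names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def count_mismatches(strand_a: str, strand_b: str) -> int:
    """Count positions at which strand_a fails to pair with strand_b.

    Both strands are given 5' to 3' and must have the same length.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same length")
    partner = reverse_complement(strand_b).upper()
    return sum(a != b for a, b in zip(strand_a.upper(), partner))


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """Return True if the two equal-length strands pair with no mismatches."""
    return len(strand_a) == len(strand_b) and count_mismatches(strand_a, strand_b) == 0
```

For instance, the reverse complement of `ATGCAA` is `TTGCAT`, so those two strands are fully complementary, while `ATGCAA` and `TTGCAA` pair with one mismatch.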

As used herein, the terms “domain” and “feature”, when used in reference to a polypeptide or amino acid sequence, refer to a subsequence of an amino acid sequence that has a particular biological function. Domains and features that have a particular biological function include, but are not limited to, a signal sequence, a ligand binding domain, a nucleic acid binding domain, a catalytic domain, a substrate binding domain, and a polypeptide-polypeptide interacting domain. Similarly, when used herein in reference to a nucleotide sequence, a “domain” or “feature” is that subsequence of the nucleotide sequence that encodes a domain or feature of a polypeptide.

As used herein, the term “expression cassette” refers to a nucleic acid molecule capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest which is operably linked to termination signals. It also typically comprises sequences required for proper translation of the nucleotide sequence. The coding region usually encodes a polypeptide of interest but can also encode a functional RNA of interest, for example antisense RNA or a non-translated RNA, in the sense or antisense direction. The expression cassette comprising the nucleotide sequence of interest can be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette can also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. Typically, however, at least one component of the expression cassette is heterologous with respect to the host; for example, a particular DNA sequence of the expression cassette does not occur naturally in the host cell and was introduced into the host cell or an ancestor of the host cell by a transformation event. The expression of the nucleotide sequence in the expression cassette can be under the control of a promoter (for example, the Q protein promoter of SEQ ID NO: 6), and in some embodiments, a promoter that initiates transcription only when the host cell is exposed to some particular external stimulus. In the case of a multicellular organism such as a plant, the promoter can also be specific to a particular tissue, organ, or stage of development (for example, a plant seed).
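The arrangement of an expression cassette can be pictured as a simple data structure. The sketch below is a schematic model only: the class names, the example coding sequence and terminator, and the use of a text label for the origin of each element are all hypothetical, and comparing origin labels is a crude stand-in for the notion of a chimeric cassette defined above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Element:
    """One component of a cassette, labeled with the gene it derives from."""
    name: str
    origin: str


@dataclass(frozen=True)
class ExpressionCassette:
    """Promoter, nucleotide sequence of interest, and termination signal, in order."""
    promoter: Element
    sequence_of_interest: Element
    terminator: Element

    def elements(self) -> Tuple[Element, Element, Element]:
        return (self.promoter, self.sequence_of_interest, self.terminator)

    def is_chimeric(self) -> bool:
        """True if at least one component differs in origin from another component."""
        return len({element.origin for element in self.elements()}) > 1


# Hypothetical example: the Q protein promoter driving an unrelated coding sequence.
example_cassette = ExpressionCassette(
    promoter=Element("Q protein promoter (SEQ ID NO: 6)", "Zea mays Q protein gene"),
    sequence_of_interest=Element("example enzyme coding sequence", "example enzyme gene"),
    terminator=Element("example terminator", "example terminator source gene"),
)

# Hypothetical example: all components drawn from the same gene.
native_cassette = ExpressionCassette(
    promoter=Element("Q protein promoter (SEQ ID NO: 6)", "Zea mays Q protein gene"),
    sequence_of_interest=Element("Q protein ORF (SEQ ID NO: 1)", "Zea mays Q protein gene"),
    terminator=Element("Q protein gene terminator", "Zea mays Q protein gene"),
)
```

In this model `example_cassette.is_chimeric()` is true and `native_cassette.is_chimeric()` is false; whether a real cassette is heterologous with respect to a host depends on the host cell as well, which this sketch does not model.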

As used herein, the term “fragment” refers to a sequence that comprisesa subset of another sequence. When used in the context of a nucleic acidor amino acid sequence, the terms “fragment” and “subsequence” are usedinterchangeably. A fragment of a nucleotide sequence can be any numberof nucleotides that is less than that found in another nucleotidesequence, and thus includes, but is not limited to, the sequences of anexon or intron, a promoter, an enhancer, an origin of replication, a 5′or 3′ untranslated region, a coding region, and a polypeptide bindingdomain. It is understood that a fragment or subsequence can alsocomprise less than the entirety of a nucleotide sequence, for example, aportion of an exon or intron, promoter, enhancer, etc. Similarly, afragment or subsequence of an amino acid sequence can be any number ofresidues that is less than that found in a naturally occurringpolypeptide, and thus includes, but is not limited to, domains,features, repeats, etc. Also similarly, it is understood that a fragmentor subsequence of an amino acid sequence need not comprise the entiretyof the amino acid sequence of the domain, feature, repeat, etc. Afragment can also be a “functional fragment”, in which the fragmentretains a specific biological function of the nucleotide sequence oramino acid sequence of interest. For example, a functional fragment of atranscription factor can include, but is not limited to, a DNA bindingdomain, a transactivating domain, or both. Similarly, a functionalfragment of a receptor tyrosine kinase can include, but is not limitedto, a ligand binding domain, a kinase domain, an ATP binding domain, andcombinations thereof.

As used herein, the term “gene” is used broadly to refer to any segmentof DNA associated with a biological function. Thus, genes include, butare not limited to, coding sequences and/or the regulatory sequencesrequired for their expression. Genes can also include non-expressed DNAsegments that, for example, form recognition sequences for apolypeptide. Genes can be obtained from a variety of sources, includingcloning from a source of interest or synthesizing from known orpredicted sequence information, and can include sequences designed tohave desired parameters.

The terms “heterologous” and “recombinant”, when used herein to refer toa nucleotide sequence (e.g. a DNA sequence) or a gene, refer to asequence that originates from a source foreign to the particular hostcell or, if from the same source, is modified from its original form.Thus, a heterologous gene in a host cell includes a gene that isendogenous to the particular host cell but has been modified through,for example, the use of DNA shuffling or other recombinant techniques.The terms also include non-naturally occurring multiple copies of anaturally occurring DNA sequence. Thus, the terms refer to a DNA segmentthat is foreign to the host cell, or naturally occurring in the hostcell but in a position or form within the host cell in which the elementis not ordinarily found in nature. Similarly, when used in the contextof a polypeptide or amino acid sequence, a heterologous polypeptide oramino acid sequence is a polypeptide or amino acid sequence thatoriginates from a source foreign to the particular host cell (e.g., isgenerated from a heterologous coding sequence) or, if from the samesource, is modified from its original form. Thus, heterologous DNAsegments can be expressed to yield heterologous polypeptides.

A “homologous” nucleotide (or amino acid) sequence is a nucleotide (or amino acid) sequence naturally associated with a host cell into which it is introduced and that is present in the chromosomal or extrachromosomal position in which it is normally found in nature.

The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture of (e.g., total cellular) DNA or RNA. The phrase “bind(s) substantially” refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleotide sequence.

As used herein, the terms “mutation” and “mutant” carry their traditional connotations and refer to a change, inherited, naturally occurring, or introduced, in a nucleic acid or polypeptide sequence, and are used in their senses as generally known to those of skill in the art.

As used herein, the term “inhibitor” refers to a chemical substance that inactivates or decreases the biological activity of a polypeptide such as a biosynthetic and catalytic activity, receptor, signal transduction polypeptide, structural gene product, or transport polypeptide. The term “herbicide” (or “herbicidal compound”) is used herein to define an inhibitor applied to a plant at any stage of development, whereby the herbicide inhibits the growth of the plant or kills the plant.

As used herein, the term “isolated”, when used in the context of an isolated DNA molecule or an isolated polypeptide, is a DNA molecule or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated DNA molecule or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.

As used herein, the term “mature polypeptide” refers to a polypeptide from which the transit peptide, signal peptide, and/or propeptide portions have been removed.

As used herein, the term “minimal promoter” refers to the smallest piece of a promoter, such as a TATA element, that can support any transcription. A minimal promoter typically has greatly reduced promoter activity in the absence of upstream or downstream activation. In the presence of a suitable transcription factor, a minimal promoter can function to permit transcription.

As used herein, the terms “cell”, “cell line”, and “cell culture” are used interchangeably, and all such designations include progeny. Thus, the words “transformants” and “transformed cells” include the primary subject cell and cultures derived therefrom without regard for the number of transfers and/or rounds of cell division that the originally manipulated cell or cells might have experienced. It is also understood that all progeny might not be precisely identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny that have the same function or biological activity as screened for in the originally transformed cell are encompassed by the terms. Where distinct designations are intended, it will be clear from the context.

As used herein, the term “native” refers to a gene that is naturally present in the genome of an untransformed plant cell. Similarly, when used in the context of a polypeptide, a “native polypeptide” is a polypeptide that is encoded by a native gene of an untransformed plant cell's genome.

As used herein, the term “naturally occurring” refers to an object that is found in nature as distinct from being artificially produced by man. For example, a polypeptide or nucleotide sequence that is present in an organism in its natural state, which has not been intentionally modified or isolated by man in the laboratory, is naturally occurring. As such, a polypeptide or nucleotide sequence is considered “non-naturally occurring” if it is encoded by or present within a recombinant molecule, even if the amino acid or nucleotide sequence is identical to an amino acid or nucleotide sequence found in nature.

As used herein, the term “nucleic acid” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleotide sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly disclosed. Specifically, degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more (or all) selected codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994). The terms “nucleic acid” or “nucleotide sequence” can also be used interchangeably with gene, cDNA, and mRNA encoded by a gene.
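By way of illustration only, the third-position substitution described above can be sketched in Python. The function name `degenerate_third_positions` and its parameters are hypothetical and are not part of the disclosure. The sketch assumes an in-frame coding sequence and uses the IUPAC symbol “N” for a mixed base; the symbol “I” for deoxyinosine is an assumed notation.

```python
def degenerate_third_positions(cds, codon_indices, symbol="N"):
    """Replace the third base of the selected codons with a degenerate symbol.

    cds           -- in-frame coding sequence (length must be a multiple of 3)
    codon_indices -- zero-based indices of the codons to modify (hypothetical API)
    symbol        -- "N" for a mixed base; "I" is an assumed notation for deoxyinosine
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for idx in codon_indices:
        codons[idx] = codons[idx][:2] + symbol
    return "".join(codons)


# Only the alanine codon (GCT) is selected; GCN still encodes alanine.
print(degenerate_third_positions("ATGGCTAAA", [1]))
```

Selecting codons individually matters because a degenerate third position preserves the encoded residue only for codon families in which that position is fully redundant.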

As used herein, the term “orthologs” refers to genes in different species that encode proteins that perform the same biological function. For example, the glucose-6-phosphate dehydrogenase genes from, for example, sorghum and rice, are orthologs. Typically, orthologous nucleotide sequences are characterized by a high degree of sequence similarity (for example, at least about 90% sequence identity). A nucleotide sequence of an ortholog in one species (for example, rice) can be used to isolate the nucleotide sequence of the ortholog in another species (for example, sorghum) using standard molecular biology techniques. This can be accomplished, for example, using techniques described in more detail below (see also Sambrook & Russell, 2001 for a discussion of hybridization conditions that can be used to isolate closely related sequences).

As used herein, the phrase “percent identical”, in the context of two nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that have in some embodiments 60% (e.g., 60, 63, 65, 67, or 69%), in some embodiments 70% (e.g., 70, 73, 75, 77, or 79%), in some embodiments 80% (e.g., 80, 83, 85, 87, or 89%), in some embodiments 90% (e.g., 90, 93, 95, or 97%), and in some embodiments at least 99% nucleotide or amino acid residue identity, respectively, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. The percent identity exists in some embodiments over a region of the sequences that is at least about 50 residues in length, in some embodiments over a region of at least about 100 residues, and in some embodiments over a region of at least about 150 residues. In some embodiments, the percent identity exists over the entire length of the sequences.

For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
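By way of illustration only, the final step of such a comparison, computing percent identity from an alignment that has already been produced, can be sketched as follows. The function name is hypothetical, and the choice of denominator (all alignment columns other than gap-gap columns) is an assumption; different programs count gapped columns differently.

```python
def percent_identity(aligned_ref, aligned_test):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs are aligned strings of equal length in which '-' marks a gap.
    Columns where both sequences have a gap are ignored; a column with a gap in
    only one sequence counts as compared but not identical (an assumption).
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_ref.upper(), aligned_test.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```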

Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm disclosed in Smith & Waterman, 1981, by the homology alignment algorithm disclosed in Needleman & Wunsch, 1970, by the search for similarity method disclosed in Pearson & Lipman, 1988, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG® WISCONSIN PACKAGE®, available from Accelrys, Inc., San Diego, Calif., United States of America), or by visual inspection. See generally, Ausubel et al., 2002; Ausubel et al., 2003.
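By way of illustration only, a global alignment in the style of Needleman & Wunsch can be sketched with dynamic programming as follows. This is a minimal sketch and is not the GAP or BESTFIT implementation; the scoring values (match +1, mismatch −1, linear gap penalty −1) are assumptions chosen for readability.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "ACT"))
```

A linear gap penalty keeps the recurrence to a single matrix; production tools typically use affine gap penalties, which require additional state.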

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990. Software for performing BLAST analysis is publicly available through the website of the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. See generally, Altschul et al., 1990. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See Henikoff & Henikoff, 1992.
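By way of illustration only, the seed-and-extend logic described above can be sketched for nucleotide sequences as follows. This is a simplified sketch and not the BLAST implementation: seeds are exact word matches only, extension is ungapped and shown in one direction, and the function names and the X value of 20 are assumptions. The reward M=5 and penalty N=−4 follow the defaults recited above.

```python
def find_seeds(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend_right(query, subject, qi, sj, reward=5, penalty=-4, x_drop=20):
    """Ungapped rightward extension from (qi, sj); returns (best_score, best_length).

    Extension stops when the score falls x_drop below its maximum, when the
    cumulative score reaches zero or below, or when either sequence ends.
    """
    score = best = 0
    best_len = 0
    k = 0
    while qi + k < len(query) and sj + k < len(subject):
        score += reward if query[qi + k] == subject[sj + k] else penalty
        k += 1
        if score > best:
            best, best_len = score, k
        elif best - score >= x_drop or score <= 0:
            break
    return best, best_len


query = "AAACCCGGGTTT"
subject = "TTAAACCCGGGAAA"
seeds = find_seeds(query, subject, w=6)
print(seeds[0], extend_right(query, subject, *seeds[0]))
```

A full implementation would also extend leftward from each seed, merge overlapping extensions, and evaluate statistical significance.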

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see e.g., Karlin & Altschul, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleotide sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleotide sequence to the reference nucleotide sequence is in some embodiments less than about 0.1, in some embodiments less than about 0.01, and in some embodiments less than about 0.001.

As used herein, the terms “Q protein”, “55 kDa protein”, “55 kDa Q protein”, and “55 kDa maize protein” are used interchangeably and refer to a polypeptide comprising the amino acid sequence of SEQ ID NO: 2, and variants, fragments, and domains thereof. A representative open reading frame for this polypeptide sequence has been identified, and has the nucleotide sequence provided in SEQ ID NO: 1. Additionally, a region of the maize promoter that controls transcription of this coding sequence in maize, referred to alternatively herein as the “Q protein gene promoter”, the “promoter from the Q protein gene”, etc., has been identified, and comprises the nucleotide sequence presented in SEQ ID NO: 6.

As used herein, the term “shuffled nucleic acid” refers to a recombinant nucleic acid molecule in which the nucleotide sequence comprises a plurality of nucleotide sequence fragments, wherein at least one of the fragments corresponds to a region of a nucleotide sequence listed in SEQ ID NO: 1, and wherein at least two of the plurality of sequence fragments are in an order, from 5′ to 3′, which is not an order in which the plurality of fragments naturally occur in a nucleic acid.

The term “substantially identical”, in the context of two nucleotide or amino acid sequences, refers to two or more sequences or subsequences that have in some embodiments at least about 60% nucleotide or amino acid identity (e.g., 60, 63, 65, 67, or 69% nucleotide or amino acid identity), in some embodiments at least about 70% nucleotide or amino acid identity (e.g., 70, 73, 75, 78, or 79% nucleotide or amino acid identity), in some embodiments at least about 80% nucleotide or amino acid identity (e.g., 80, 83, 85, 88, or 89% nucleotide or amino acid identity), and in some embodiments at least about 90% nucleotide or amino acid identity (e.g., 90, 93, 95, 98, or 99% nucleotide or amino acid identity), when compared and aligned for maximum correspondence, as measured using one of the above-referenced sequence comparison algorithms or by visual inspection. In some embodiments, the substantial identity exists in nucleotide or amino acid sequences of at least about 50 residues, in some embodiments in nucleotide or amino acid sequences of at least about 100 residues, in some embodiments in nucleotide or amino acid sequences of at least about 150 residues, and in some embodiments in nucleotide or amino acid sequences comprising complete coding sequences or complete amino acid sequences.

In one aspect, polymorphic sequences can be substantially identical sequences. The term “polymorphic” refers to two or more genetically determined alternative sequences or alleles in a population. An allelic difference can be as small as one base pair. Nonetheless, one of ordinary skill in the art would recognize that the polymorphic sequences correspond to the same gene.

Another indication that two nucleotide sequences are substantially identical is that the two molecules specifically or substantially hybridize to each other under conditions of medium or high stringency. In the context of nucleic acid hybridization, two nucleotide sequences being compared can be designated a “probe sequence” and a “target sequence”. A “probe sequence” is a reference nucleic acid molecule, and a “target sequence” is a test nucleic acid molecule, often found within a heterogeneous population of nucleic acid molecules. A “target sequence” is synonymous with a “test sequence”.

An exemplary nucleotide sequence employed for hybridization studies or assays includes probe sequences that are complementary to or mimic in some embodiments at least an about 14 to 40 nucleotide sequence of a nucleic acid molecule of the presently disclosed subject matter. In some embodiments, probes comprise 14 to 20 nucleotides, or even longer where desired, such as 30, 40, 50, 60, 100, 200, 300, or 500 nucleotides or up to the full length (for example, the full complement) of the nucleotide sequence set forth in SEQ ID NO: 1. Such fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production.
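By way of illustration only, candidate probe sequences complementary to windows of a target sequence can be enumerated as follows. The function names, the window step, and the placeholder target are hypothetical; the minimum length of 14 nucleotides follows the range recited above, and the sketch handles unambiguous DNA bases only.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence made of A, C, G, and T."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_probes(target, length=20, step=10):
    """Return probes (written 5' to 3') complementary to successive target windows."""
    if length < 14:
        raise ValueError("probe length below the 14-nucleotide minimum")
    if step < 1:
        raise ValueError("step must be positive")
    return [
        reverse_complement(target[start:start + length])
        for start in range(0, len(target) - length + 1, step)
    ]


# Placeholder target for illustration; it is not taken from SEQ ID NO: 1.
target = "ATGCATGCATGCATGCATGCATGCATGCAT"
for probe in tile_probes(target, length=14, step=8):
    print(probe)
```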

The phrase “hybridizing substantially to” refers to complementary hybridization between a probe nucleic acid molecule and a target nucleic acid molecule and embraces minor mismatches (for example, polymorphisms) that can be accommodated by reducing the stringency of the hybridization and/or wash media to achieve the desired hybridization.

“Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and Northern blot analysis are both sequence- and environment-dependent. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, 1993. Generally, high stringency hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. Typically, under “highly stringent conditions” a probe will hybridize specifically to its target subsequence, but to no other sequences. Similarly, medium stringency hybridization and wash conditions are selected to be more than about 5° C. lower than the T_(m) for the specific sequence at a defined ionic strength and pH. Exemplary medium stringency conditions include hybridizations and washes as for high stringency conditions, except that the temperatures for the hybridization and washes are in some embodiments 8° C., in some embodiments 10° C., in some embodiments 12° C., and in some embodiments 15° C. lower than the T_(m) for the specific sequence at a defined ionic strength and pH.
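By way of illustration only, the temperature offsets recited above can be expressed as a small helper that takes a T_(m) already determined for the probe at the ionic strength and pH of interest. The function name and its parameters are hypothetical, and the sketch does not estimate the T_(m) itself.

```python
HIGH_STRINGENCY_OFFSET_C = 5
MEDIUM_STRINGENCY_OFFSETS_C = (8, 10, 12, 15)


def hybridization_temperature(tm_c, stringency="high", medium_offset_c=10):
    """Return a hybridization/wash temperature in degrees C for a given T_m.

    tm_c is assumed to have been measured or estimated elsewhere for the
    specific sequence, ionic strength, and pH.
    """
    if stringency == "high":
        return tm_c - HIGH_STRINGENCY_OFFSET_C
    if stringency == "medium":
        if medium_offset_c not in MEDIUM_STRINGENCY_OFFSETS_C:
            raise ValueError("medium offset must be one of 8, 10, 12, or 15")
        return tm_c - medium_offset_c
    raise ValueError("stringency must be 'high' or 'medium'")


print(hybridization_temperature(70.0))
print(hybridization_temperature(70.0, "medium", 15))
```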

The T_(m) is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the T_(m) for a particular probe. An example of highly stringent hybridization conditions for Southern or Northern Blot analysis of complementary nucleic acids having more than about 100 complementary residues is overnight hybridization in 6× standard saline citrate (SSC) or standard saline-phosphate-EDTA (SSPE) at 65° C. (or at 42° C. if 50% formamide is included in the hybridization buffer) containing 5× Denhardt's reagent, 0.5% sodium dodecyl sulfate (SDS), 1 μg/ml poly(A), and 100 μg/ml salmon sperm DNA (50× Denhardt's reagent is 1% (w/v) Ficoll 400, 1% (w/v) polyvinylpyrrolidone, and 1% (w/v) bovine serum albumin; see Sambrook and Russell, 2001, for alternative hybridization and wash conditions and solutions that can be used for the same). An example of highly stringent wash conditions is 15 minutes in 0.1×SSC, 0.1% (w/v) SDS at 65° C. Another example of highly stringent wash conditions is 15 minutes in 0.2×SSC buffer at 65° C. Often, a high stringency wash is preceded by a lower stringency wash to remove background probe signal. An example of medium stringency wash conditions for a duplex of more than about 100 nucleotides is 15 minutes in 1×SSC at 45-55° C. Another example of medium stringency wash for a duplex of more than about 100 nucleotides is 15 minutes in 4-6×SSC at 40° C. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1 M Na⁺ ion, typically about 0.01 to 1 M Na⁺ ion concentration (or other salts) at pH 7.0-8.3, and the temperature is typically at least about 30° C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal-to-noise ratio that is 2-fold (or more) higher than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.
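By way of illustration only, the sodium ion concentration contributed by an SSC buffer at a given dilution can be estimated as follows. The sketch assumes the commonly used recipe in which 20×SSC is 3 M sodium chloride and 0.3 M trisodium citrate, so that 1×SSC is 0.15 M NaCl and 0.015 M citrate; that recipe is not stated in the present disclosure. Sodium contributed by SDS or other components is ignored, and the function name is hypothetical.

```python
NACL_MOLAR_AT_1X = 0.150      # assumed: 20x SSC stock is 3 M NaCl
CITRATE_MOLAR_AT_1X = 0.015   # assumed: 20x SSC stock is 0.3 M trisodium citrate


def ssc_sodium_molar(fold):
    """Approximate Na+ molarity of an SSC buffer at the given concentration (e.g., 0.1 for 0.1x)."""
    if fold < 0:
        raise ValueError("fold concentration cannot be negative")
    # Each trisodium citrate contributes three sodium ions.
    return fold * (NACL_MOLAR_AT_1X + 3 * CITRATE_MOLAR_AT_1X)


for fold in (6, 1, 0.2, 0.1):
    print(f"{fold}x SSC: about {ssc_sodium_molar(fold):.4f} M Na+")
```

Under these assumptions, lowering the SSC concentration of a wash lowers the sodium ion concentration proportionally, which is one reason that a 0.1×SSC wash is more stringent than a 1×SSC wash at the same temperature.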

The following are examples of hybridization and wash conditions that can be used to clone homologous nucleotide sequences that are substantially identical to reference nucleotide sequences of the presently disclosed subject matter: in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM ethylene diamine tetraacetic acid (EDTA) at 50° C. followed by washing in 2×SSC, 0.1% SDS at 50° C.; in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 1×SSC, 0.1% SDS at 50° C.; in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.5×SSC, 0.1% SDS at 50° C.; in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 50° C.; in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 55° C.; in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 60° C.; and in some embodiments, a probe and target sequence hybridize in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 65° C. In some embodiments, hybridization conditions comprise hybridization in a roller tube for at least 12 hours at 42° C. in 7% SDS, 0.5 M NaPO₄, 1 mM EDTA.
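By way of illustration only, the seven wash conditions listed above (each following hybridization at 50° C. and each containing 0.1% SDS) can be recorded as data and ranked. The `Wash` class and the ranking rule, lower SSC concentration first and then higher temperature, are assumptions introduced here for illustration; the disclosure itself does not state an ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wash:
    ssc_fold: float  # SSC concentration, e.g., 0.1 for 0.1x SSC
    temp_c: int      # wash temperature in degrees C


# Wash conditions in the order in which they are recited above.
WASHES = [
    Wash(2.0, 50), Wash(1.0, 50), Wash(0.5, 50), Wash(0.1, 50),
    Wash(0.1, 55), Wash(0.1, 60), Wash(0.1, 65),
]


def stringency_key(wash):
    """Assumed ranking: lower salt is more stringent; ties broken by higher temperature."""
    return (-wash.ssc_fold, wash.temp_c)


ranked = sorted(WASHES, key=stringency_key)
print("least stringent:", ranked[0])
print("most stringent:", ranked[-1])
```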

As used herein, the term “pre-polypeptide” refers to a polypeptide that is normally targeted to a cellular organelle, such as a chloroplast, and still comprises a transit peptide.

As used herein, the term “purified”, when applied to a nucleic acid or polypeptide, denotes that the nucleic acid or polypeptide is essentially free of other cellular components with which it is associated in the natural state. It can be in a homogeneous state although it can be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A polypeptide that is the predominant species present in a preparation is substantially purified. The term “purified” denotes that a nucleic acid or polypeptide gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or polypeptide is in some embodiments at least about 50% pure, in some embodiments at least about 85% pure, and in some embodiments at least about 99% pure.

Two nucleic acids are “recombined” when sequences from each of the two nucleic acids are combined in a progeny nucleic acid. Two sequences are “directly recombined” when both of the nucleic acids are substrates for recombination. Two sequences are “indirectly recombined” when the sequences are recombined using an intermediate such as a cross-over oligonucleotide. For indirect recombination, no more than one of the sequences is an actual substrate for recombination, and in some cases, neither sequence is a substrate for recombination.

As used herein, the term “regulatory elements” refers to nucleotide sequences involved in controlling the expression of a nucleotide sequence. Regulatory elements can comprise a promoter operably linked to the nucleotide sequence of interest and termination signals. Regulatory sequences also include enhancers and silencers. They also typically encompass sequences required for proper translation of the nucleotide sequence.

As used herein, the term “transformation” refers to a process for introducing heterologous DNA into a plant cell, plant tissue, or plant. Transformed plant cells, plant tissue, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

As used herein, the terms “transformed”, “transgenic”, and “recombinant” refer to a host cell or organism such as a bacterium or a plant cell (e.g., a plant) into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof. A “non-transformed”, “non-transgenic”, or “non-recombinant” host refers to a wild-type organism, e.g., a bacterium or plant, which does not contain the heterologous nucleic acid molecule.

As used herein, the term “heterologous” as it relates to a nucleotide sequence (or an amino acid sequence encoded thereby) refers not only to a nucleic acid or amino acid sequence that is derived from a species other than the species into which it is introduced, but also refers to a nucleotide sequence from the same species that is manipulated in the genome of a cell or organism of that species such that the genome contains some man-made alteration.

Thus, the term “transgenic” refers not only to a cell of Zea mays, for example, that comprises a nucleic acid molecule that is not naturally occurring in Zea mays, but also includes a cell of Zea mays, for example, that comprises a nucleic acid molecule all or parts of which are naturally occurring in Zea mays, but have been modified in some form such that the genome of the transgenic plant is identifiably different from that of a naturally occurring Zea mays. In some embodiments, the modification comprises introducing one or more additional copies of a Zea mays nucleotide sequence into a Zea mays cell. In some embodiments, the modification comprises “knocking in” a heterologous nucleotide sequence into the Q protein gene, such that the heterologous nucleotide sequence becomes operably linked to the endogenous Q protein gene promoter.

III. Nucleic Acid Molecules and Polypeptides

III.A. Nucleic Acid Molecules

Embodiments of the presently disclosed subject matter encompass isolated nucleic acid molecules corresponding to members of the prolamin family of Zea mays cereal seed storage proteins, and a nucleotide sequence from the promoter region of one such family member that can be used to control expression of operably linked nucleotide sequences. In some embodiments, an isolated nucleic acid molecule of the presently disclosed subject matter comprises a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence as set forth in SEQ ID NO: 1, or a fragment, domain, or feature thereof. In some embodiments, an isolated nucleic acid molecule of the presently disclosed subject matter comprises a nucleotide sequence having substantial identity to a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence as set forth in SEQ ID NO: 1, or a fragment, domain, or feature thereof. In some embodiments, the presently disclosed subject matter encompasses an isolated nucleic acid molecule comprising a nucleotide sequence that is complementary to, or the reverse complement of, a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof. Some embodiments of the presently disclosed subject matter encompass an isolated nucleic acid molecule comprising a nucleotide sequence that is complementary to, or the reverse complement of, a nucleotide sequence that has substantial identity to, or is capable of hybridizing to, a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof.

In some embodiments, the substantial identity is at least about 60% identity (e.g., 60, 63, 65, 67, or 69% identity), in some embodiments at least about 70% identity (e.g., 70, 73, 75, 77, or 79% identity), in some embodiments about 80% identity (e.g., 80, 83, 85, 87, or 89% identity), in some embodiments about 90% identity (e.g., 90 or 93% identity), in some embodiments about 95% identity, in some embodiments about 97% identity, and in some embodiments at least about 99% identity to the nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof.

In some embodiments, the nucleotide sequence having substantial identity comprises an allelic variant of the nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof. In some embodiments, the nucleotide sequence having substantial identity comprises a naturally occurring variant. In some embodiments, the nucleotide sequence having substantial identity comprises a polymorphic variant of the nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof.

In some embodiments, the nucleic acid having substantial identity comprises a deletion or insertion of at least one nucleotide. In some embodiments, the deletion or insertion comprises less than about thirty nucleotides. In some embodiments, the deletion or insertion comprises less than about five nucleotides. In some embodiments, the sequence of the isolated nucleic acid having substantial identity comprises a substitution in at least one codon. In some embodiments, the substitution is conservative.

In some embodiments, the isolated nucleic acid comprises a plurality of regions having a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or an exon, domain, or feature thereof.

In some embodiments, the sequence having substantial identity to the nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof, is from a plant. In some embodiments, the plant is a dicot. In some embodiments, the plant is a gymnosperm. In some embodiments, the plant is a monocot. In some embodiments, the monocot is a cereal. In some embodiments, the cereal can be, for example, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, gramma grass, Tripsacum sp., or teosinte. In some embodiments, the cereal is rice.

In some embodiments, the nucleic acid is expressed in a specific location or tissue of a plant. In some embodiments, the location or tissue includes, but is not limited to, epidermis, root, vascular tissue, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof. In some embodiments, the location or tissue is a seed. In some embodiments, the location or tissue is a protein body of a seed.

Embodiments of the presently disclosed subject matter also relate to a shuffled nucleic acid molecule comprising a plurality of nucleotide sequence fragments, wherein at least one of the fragments corresponds to a region of a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, and wherein at least two of the plurality of sequence fragments are in an order, from 5′ to 3′, which is not an order in which the plurality of fragments naturally occur. In some embodiments, all of the fragments in a shuffled nucleic acid comprising a plurality of nucleotide sequence fragments are from a single gene. In some embodiments, the plurality of fragments is derived from at least two different genes. In some embodiments, the shuffled nucleic acid is operably linked to a promoter sequence. In some embodiments, the shuffled nucleic acid comprises a chimeric polynucleotide comprising a promoter sequence operably linked to the shuffled nucleic acid. In some embodiments, the shuffled nucleic acid is contained within a host cell.

III.B. Identifying, Cloning, and Sequencing cDNAs

The cloning and sequencing of the cDNAs of the presently disclosed subject matter is accomplished using techniques known in the art. See generally, Sambrook & Russell, 2001; Silhavy et al., 1984; Ausubel et al., 2002; Ausubel et al., 2003; Reiter et al., 1992; Schultz et al., 1998.

The isolated nucleic acids and polypeptides of the presently disclosed subject matter are usable over a range of plants (monocots and dicots), in particular monocots such as sorghum, rice, wheat, barley, and maize. In some embodiments, the monocot is a cereal. In some embodiments, the cereal can be, for example, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, gramma grass, Tripsacum sp., or teosinte. In some embodiments, the cereal is maize. Other plant genera relevant to the presently disclosed subject matter include, but are not limited to, Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus, Lolium, Oryza, Avena, Hordeum, Secale, Allium, and Triticum.

The presently disclosed subject matter also provides a method for genotyping a plant or plant part comprising a nucleic acid molecule of the presently disclosed subject matter. Optionally, the plant is a monocot such as, but not limited to, sorghum, rice, maize, or wheat. Genotyping provides a methodology for distinguishing homologs of a chromosome pair and can be used to differentiate segregants in a plant population. Molecular marker methods can be used in phylogenetic studies, characterizing genetic relationships among crop varieties, identifying crosses or somatic hybrids, localizing chromosomal segments affecting monogenic traits, map-based cloning, and the study of quantitative inheritance (see Clark, 1997; Paterson, 1996).

The method for genotyping can employ any number of molecular marker analytical techniques including, but not limited to, restriction fragment length polymorphisms (RFLPs). As is well known in the art, RFLPs are produced by differences in the DNA restriction fragment lengths resulting from nucleotide differences between alleles of the same gene. Thus, the presently disclosed subject matter provides a method for following segregation of a gene or nucleic acid of the presently disclosed subject matter, or of chromosomal sequences genetically linked thereto, by using RFLP analysis. Linked chromosomal sequences are in some embodiments within 50 centimorgans (cM), in some embodiments within 40 cM, in some embodiments within 30 cM, in some embodiments within 20 cM, in some embodiments within 10 cM, and in some embodiments within 5, 3, 2, or 1 cM of the nucleic acid of the presently disclosed subject matter.
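By way of illustration only, the following minimal Python sketch shows how a single nucleotide difference between two alleles can change the restriction fragment lengths that an RFLP analysis detects. The two allele sequences are hypothetical toy sequences and are not taken from SEQ ID NO: 1; EcoRI (recognition sequence GAATTC, cutting after the first G) is used merely as a familiar example enzyme, and the function name is an assumption of this sketch.

```python
# Toy RFLP illustration: a one-base difference between alleles removes a
# restriction site and therefore changes the fragment lengths.
# All sequences below are hypothetical examples.

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC on the top strand


def fragment_lengths(sequence, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return fragment lengths from a complete digest of a linear sequence."""
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# allele_b differs from allele_a by a single C-to-T change that destroys
# the second EcoRI site.
allele_a = "ATGC" * 3 + "GAATTC" + "ATGC" * 5 + "GAATTC" + "ATGC" * 2
allele_b = "ATGC" * 3 + "GAATTC" + "ATGC" * 5 + "GAATTT" + "ATGC" * 2

print(fragment_lengths(allele_a))  # three fragments
print(fragment_lengths(allele_b))  # two fragments
```

In this toy case the two alleles are distinguishable by fragment pattern alone, which is the property that allows segregation of a linked sequence to be followed.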

Embodiments of the presently disclosed subject matter also relate to an isolated nucleic acid molecule comprising a nucleotide sequence, its complement (for example, its full complement), or its reverse complement (for example, its full reverse complement), the nucleotide sequence encoding a polypeptide (for example, a biologically active polypeptide or biologically active fragment). In some embodiments, the nucleotide sequence encodes a polypeptide that is an ortholog of a polypeptide comprising a polypeptide sequence listed in SEQ ID NO: 2, or a fragment, domain, repeat, feature, or chimera thereof. In some embodiments, the nucleotide sequence encodes a polypeptide that is an ortholog of a polypeptide comprising a polypeptide sequence having substantial identity to a polypeptide sequence listed in SEQ ID NO: 2, or a fragment, domain, repeat, feature, or chimera thereof. In some embodiments, the nucleotide sequence encodes a polypeptide that is an ortholog of a polypeptide comprising a polypeptide sequence encoded by a nucleotide sequence identical to or having substantial identity to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof, or a sequence complementary thereto. In some embodiments, the nucleotide sequence encodes a polypeptide comprising a polypeptide sequence encoded by a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or to a sequence complementary thereto. In some embodiments, the nucleotide sequence encodes a functional fragment of a polypeptide of the presently disclosed subject matter.

In some embodiments, the isolated nucleic acid comprises a polypeptide-encoding sequence. In some embodiments, the polypeptide-encoding sequence encodes a polypeptide that is an ortholog of a polypeptide comprising a polypeptide sequence listed in SEQ ID NO: 2, or a fragment thereof. In some embodiments, the polypeptide is a plant polypeptide. In some embodiments, the plant is a dicot. In some embodiments, the plant is a gymnosperm. In some embodiments, the plant is a monocot. In some embodiments, the monocot is a cereal. In some embodiments, the cereal includes, but is not limited to, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, grama grass, Tripsacum, and teosinte. In some embodiments, the cereal is maize.

Embodiments of the presently disclosed subject matter also relate to an isolated nucleic acid molecule comprising a nucleotide sequence, its complement (for example, its full complement), or its reverse complement (for example, its full reverse complement), encoding a polypeptide selected from a group comprising one or more of:

-   -   (a) a polypeptide sequence encoded by a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof, or a sequence complementary thereto; and
    -   (b) a functional fragment of (a).

In some embodiments, the polypeptide having substantial identity comprises an allelic variant of a polypeptide that is an ortholog of a polypeptide having an amino acid sequence listed in SEQ ID NO: 2, or a fragment, domain, repeat, feature, or chimera thereof. In some embodiments, the isolated nucleic acid comprises a plurality of regions from the polypeptide sequence encoded by a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof, or a sequence complementary thereto.

III.C. Polypeptides

The presently disclosed subject matter further relates to isolated polypeptides that are orthologs of the polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 2, including biologically active polypeptides. In some embodiments, the polypeptide comprises a polypeptide sequence of an ortholog of a polypeptide listed in SEQ ID NO: 2. In some embodiments, the polypeptide comprises a functional fragment or domain of an ortholog of a polypeptide comprising a polypeptide sequence listed in SEQ ID NO: 2. In some embodiments, the polypeptide comprises a chimera of an ortholog of the polypeptide sequence listed in SEQ ID NO: 2, where the chimera can comprise functional polypeptide motifs, including domains, repeats, post-translational modification sites, or other features. In some embodiments, the polypeptide is a plant polypeptide. In some embodiments, the plant is a dicot. In some embodiments, the plant is a gymnosperm. In some embodiments, the plant is a monocot. In some embodiments, the monocot is a cereal. In some embodiments, the cereal is, for example, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, grama grass, Tripsacum, or teosinte. In some embodiments, the cereal is maize.

In some embodiments, the polypeptide is expressed in a specific location or tissue of a plant. In some embodiments, the location or tissue includes, but is not limited to, epidermis, root, vascular tissue, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof. In some embodiments, the location or tissue is a seed. In some embodiments, the location or tissue is a protein body of a seed.

In some embodiments, isolated polypeptides comprise the amino acid sequences of orthologs of the polypeptides comprising the amino acid sequence set forth in SEQ ID NO: 2, and variants having conservative amino acid modifications. The term “conservatively modified variants” refers to polypeptides that can be encoded by nucleotide sequences having degenerate codon substitutions wherein at least one position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994). Additionally, one skilled in the art will recognize that an individual substitution, deletion, or addition to a nucleic acid, peptide, polypeptide, or polypeptide sequence that alters, adds, or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservative modification” where the modification results in the substitution of an amino acid with a chemically similar amino acid. Conservatively modified variants provide biological activity similar to that of the unmodified polypeptide. Conservative substitution tables listing functionally similar amino acids are known in the art. See Creighton, 1984.

The term “conservatively modified variant” also refers to a peptide having an amino acid residue sequence substantially identical to a sequence of a polypeptide of the presently disclosed subject matter in which one or more residues have been conservatively substituted with a functionally similar residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another; the substitution of one polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between glycine and serine; the substitution of one basic residue such as lysine, arginine or histidine for another; or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another.

Amino acid substitutions, such as those which might be employed in modifying the polypeptides described herein, are generally based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. An analysis of the size, shape and type of the amino acid side-chain substituents reveals that arginine, lysine and histidine are all positively charged residues; that alanine, glycine and serine are all of similar size; and that phenylalanine, tryptophan and tyrosine all have a generally similar shape. Therefore, based upon these considerations, arginine, lysine and histidine; alanine, glycine and serine; and phenylalanine, tryptophan and tyrosine; are defined herein as biologically functional equivalents. Other biologically functionally equivalent changes will be appreciated by those of skill in the art.
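As a non-limiting illustration, the three groups of biologically functional equivalents defined above can be represented as sets of one-letter residue codes and used to test whether a proposed substitution stays within a group. The function name below is hypothetical, and the sketch covers only the three groups named in the text, not other equivalent changes that those of skill in the art would appreciate.

```python
# Groups of biologically functional equivalents as defined in the text,
# written with standard one-letter amino acid codes.
EQUIVALENT_GROUPS = [
    {"R", "K", "H"},  # arginine, lysine, histidine: positively charged
    {"A", "G", "S"},  # alanine, glycine, serine: similar size
    {"F", "W", "Y"},  # phenylalanine, tryptophan, tyrosine: similar shape
]


def is_functional_equivalent(original, substitute):
    """True if both residues fall within the same group listed above."""
    return any(
        original in group and substitute in group
        for group in EQUIVALENT_GROUPS
    )


print(is_functional_equivalent("R", "K"))  # same group
print(is_functional_equivalent("A", "W"))  # different groups
```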

In making biologically functional equivalent amino acid substitutions, the hydropathic index of amino acids can be considered. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics. These are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamate (−3.5); glutamine (−3.5); aspartate (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).

The importance of the hydropathic amino acid index in conferring interactive biological function on a protein is generally understood in the art (Kyte & Doolittle, 1982, incorporated herein by reference). It is known that certain amino acids can be substituted for other amino acids having a similar hydropathic index or score and still retain a similar biological activity. In making changes based upon the hydropathic index, substitutions of amino acids can involve amino acids for which the hydropathic indices are in some embodiments within ±2 of the original value, in some embodiments within ±1 of the original value, and in some embodiments within ±0.5 of the original value.
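The ±2, ±1, and ±0.5 windows described above can be checked mechanically. The following Python sketch is offered as an example only; it uses the hydropathic index values of Kyte & Doolittle, 1982 (with arginine at −4.5, as in that reference), and the dictionary and function names are assumptions of this sketch rather than part of the disclosure.

```python
# Kyte & Doolittle (1982) hydropathic indices, keyed by one-letter code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def within_hydropathy_window(original, substitute, window=2.0):
    """True if the two residues' hydropathic indices differ by <= window."""
    return abs(HYDROPATHY[original] - HYDROPATHY[substitute]) <= window


# Isoleucine to valine differs by 0.3; isoleucine to leucine by 0.7.
print(within_hydropathy_window("I", "V", window=0.5))
print(within_hydropathy_window("I", "L", window=0.5))
print(within_hydropathy_window("I", "L", window=1.0))
```

A narrower window admits fewer substitutions, which mirrors the progression from ±2 to ±0.5 recited above.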

It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Pat. No. 4,554,101, incorporated herein by reference, states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with its immunogenicity and antigenicity, i.e. with a biological property of the protein. It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent protein.

As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); tryptophan (−3.4).

In making changes based upon similar hydrophilicity values, substitutions of amino acids can involve amino acids for which the hydrophilicity values are in some embodiments within ±2 of the original value, in some embodiments within ±1 of the original value, and in some embodiments within ±0.5 of the original value.
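For illustration, the hydrophilicity values listed above can be used to enumerate the residues that fall within a chosen window of a given residue. In the following Python sketch the nominal value is used for residues listed with a ±1 range (aspartate, glutamate, proline), which is a simplifying assumption; the names used are hypothetical.

```python
# Hydrophilicity values as listed in the text (U.S. Pat. No. 4,554,101),
# keyed by one-letter code. Nominal values are used where a range is given.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "T": -0.4, "P": -0.5,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def candidate_substitutes(residue, window=2.0):
    """Residues whose hydrophilicity is within `window` of `residue`."""
    target = HYDROPHILICITY[residue]
    return sorted(
        aa
        for aa, value in HYDROPHILICITY.items()
        if aa != residue and abs(value - target) <= window
    )


print(candidate_substitutes("S", window=0.5))
print(candidate_substitutes("W", window=1.0))
```

Because hydrophilicity is only one criterion, a residue returned by such a filter (for example, aspartate as a candidate for arginine) may still differ in charge or size, so the result is best read as a candidate list rather than a set of guaranteed equivalents.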

While discussion has focused on functionally equivalent polypeptides arising from amino acid changes, it will be appreciated that these changes can be effected by alteration of the encoding DNA, taking into consideration also that the genetic code is degenerate and that two or more codons can code for the same amino acid.
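The degeneracy of the genetic code can be illustrated with a short Python sketch that builds the standard genetic code table and shows that different DNA sequences can encode the same polypeptide. The example sequences are hypothetical and are not taken from SEQ ID NO: 1; the function names are assumptions of this sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order at each
# position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two hypothetical coding sequences that differ at three positions
# yet encode the same peptide.
print(translate("ATGCAACAATAA"))
print(translate("ATGCAGCAGTAG"))
print(synonymous_codons("Q"))
```

Glutamine, for instance, is encoded by two codons, so a glutamine-rich coding sequence can be altered at many third-codon positions without changing the encoded polypeptide.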

In some embodiments, the sequence having substantial identity contains a deletion or insertion of at least one amino acid. In some embodiments, the deletion or insertion is of less than about ten amino acids. In some embodiments, the deletion or insertion is of less than about three amino acids.

In some embodiments, the sequence having substantial identity encodes a substitution in at least one amino acid.

Embodiments of the presently disclosed subject matter also provide an isolated polypeptide comprising a polypeptide sequence selected from the group consisting of:

-   -   (a) a polypeptide sequence having substantial identity to a polypeptide sequence listed in SEQ ID NO: 2, or a domain or feature thereof;
    -   (b) a polypeptide sequence encoded by a nucleotide sequence identical to or having substantial identity to a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or an exon, domain, or feature thereof, or a sequence complementary thereto; and
    -   (c) a functional fragment of (a) or (b).

In some embodiments, a polypeptide having substantial identity to a polypeptide sequence listed in SEQ ID NO: 2, or a domain or feature thereof, is an allelic variant of the polypeptide sequence listed in SEQ ID NO: 2. In some embodiments, a polypeptide having substantial identity to a polypeptide sequence listed in SEQ ID NO: 2, or a domain or feature thereof, is a naturally occurring variant of the polypeptide sequence listed in SEQ ID NO: 2. In some embodiments, a polypeptide having substantial identity to a polypeptide sequence listed in SEQ ID NO: 2, or a domain or feature thereof, is a polymorphic variant of the polypeptide sequence listed in SEQ ID NO: 2.

In some embodiments, the polypeptide is an ortholog of a polypeptide comprising the amino acid sequence listed in SEQ ID NO: 2. In some embodiments, the polypeptide is a functional fragment or domain of an ortholog of a polypeptide comprising the amino acid sequence listed in SEQ ID NO: 2. In some embodiments, the polypeptide is a chimera, where the chimera comprises a functional polypeptide domain, including, but not limited to, a domain, a repeat, a post-translational modification site, and combinations thereof. In some embodiments, the polypeptide is a plant polypeptide. In some embodiments, the plant is a dicot. In some embodiments, the plant is a gymnosperm. In some embodiments, the plant is a monocot. In some embodiments, the monocot is a cereal. In some embodiments, the cereal can be, for example, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, grama grass, Tripsacum, or teosinte. In some embodiments, the cereal is maize.

In some embodiments, the polypeptide is expressed in a specific location or tissue of a plant. In some embodiments, the location or tissue includes, but is not limited to, epidermis, vascular tissue, meristem, cambium, cortex, or pith. In some embodiments, the location or tissue is leaf or sheath, root, flower, and developing ovule or seed. In some embodiments, the location or tissue can be, for example, epidermis, root, vascular tissue, meristem, cambium, cortex, pith, leaf, or flower. In some embodiments, the location or tissue is a seed.

In some embodiments, the polypeptide sequence is encoded by a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to the nucleotide sequence of SEQ ID NO: 1, or a fragment, domain, or feature thereof or a sequence complementary thereto, wherein the nucleotide sequence includes a deletion or insertion of at least one nucleotide. In some embodiments, the deletion or insertion is of less than about thirty nucleotides. In some embodiments, the deletion or insertion is of less than about five nucleotides. In some embodiments, the polypeptide sequence encoded by a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to the nucleotide sequence of SEQ ID NO: 1, or a fragment, domain, or feature thereof or a sequence complementary thereto, includes a substitution of at least one codon. In some embodiments, the substitution is conservative. In some embodiments, the polypeptide sequence having substantial identity to the polypeptide sequence of SEQ ID NO: 2, or a fragment, domain, repeat, feature, or chimera thereof, includes a deletion or insertion of at least one amino acid.

The polypeptides of the presently disclosed subject matter, fragments thereof, or variants thereof, can comprise any number of contiguous amino acid residues from a polypeptide of the presently disclosed subject matter, wherein the number of residues is selected from the group of integers consisting of from 10 to the number of residues in a full-length polypeptide of the presently disclosed subject matter. In some embodiments, the portion or fragment of the polypeptide is a functional polypeptide. The presently disclosed subject matter includes active polypeptides having specific activity of at least in some embodiments 20%, in some embodiments 30%, in some embodiments 40%, in some embodiments 50%, in some embodiments 60%, in some embodiments 70%, in some embodiments 80%, in some embodiments 90%, and in some embodiments 95% that of the native (non-synthetic) endogenous polypeptide. Further, the substrate specificity (k_(cat)/K_(m)) can be substantially identical to the native (non-synthetic), endogenous polypeptide. Typically the K_(m) will be at least in some embodiments 30%, in some embodiments 40%, in some embodiments 50% of the native, endogenous polypeptide; and in some embodiments at least 60%, in some embodiments 70%, in some embodiments 80%, and in some embodiments 90% of the native, endogenous polypeptide. Methods of assaying and quantifying measures of activity and substrate specificity are well known to those of skill in the art.

The isolated polypeptides of the presently disclosed subject matter can elicit production of an antibody specifically reactive to a polypeptide of the presently disclosed subject matter when presented as an immunogen. Therefore, the polypeptides of the presently disclosed subject matter can be employed as immunogens for constructing antibodies immunoreactive to a polypeptide of the presently disclosed subject matter for such purposes including, but not limited to, immunoassays or polypeptide purification techniques. Immunoassays for determining binding are well known to those of skill in the art and include, but are not limited to, enzyme-linked immunosorbent assays (ELISA) and competitive immunoassays.

Embodiments of the presently disclosed subject matter also relate to chimeric polypeptides encoded by the isolated nucleic acid molecules of the present disclosure including a chimeric polypeptide containing a polypeptide sequence encoded by an isolated nucleic acid containing a nucleotide sequence selected from the group consisting of:

-   -   (a) a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or an exon, domain, or feature thereof;
    -   (b) a nucleotide sequence complementary (for example, fully complementary) to (a);
    -   (c) a nucleotide sequence which is the reverse complement (for example, full reverse complement) of (a); and
    -   (d) a functional fragment thereof.

IV. Controlling and Altering the Expression of Nucleic Acid Molecules

IV.A. General Considerations

One aspect of the presently disclosed subject matter provides compositions and methods for altering (i.e. increasing or decreasing) the level of nucleic acid molecules and/or polypeptides of the presently disclosed subject matter in plants. In particular, the nucleic acid molecules and polypeptides of the presently disclosed subject matter are expressed constitutively, temporally, or spatially (e.g. at developmental stages), in certain tissues, and/or quantities, which are uncharacteristic of non-recombinantly engineered plants. Therefore, the presently disclosed subject matter provides utility in such exemplary applications as altering the specified characteristics identified above.

The isolated nucleic acid molecules of the presently disclosed subject matter are useful for expressing a polypeptide of the presently disclosed subject matter in a recombinantly engineered cell such as a bacterial, yeast, insect, mammalian, or plant cell. Expressing cells can produce the polypeptide in a non-natural condition (e.g. in quantity, composition, location, and/or time) because they have been genetically altered to do so. Those skilled in the art are knowledgeable in the numerous expression systems available for expression of nucleic acids encoding a polypeptide of the presently disclosed subject matter.

Embodiments of the presently disclosed subject matter provide an expression cassette comprising a promoter sequence operably linked to an isolated nucleic acid, the isolated nucleic acid comprising:

-   -   (a) a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment, domain, or feature thereof;
    -   (b) a nucleotide sequence complementary (for example, fully complementary) to (a); and
    -   (c) a nucleotide sequence that is the reverse complement (for example, the full reverse complement) of (a).

Further encompassed within the presently disclosed subject matter is a recombinant vector comprising an expression cassette according to the embodiments of the presently disclosed subject matter. Also encompassed are plant cells comprising expression cassettes according to the present disclosure, and plants comprising these plant cells. In some embodiments, the plant is a dicot. In some embodiments, the plant is a gymnosperm. In some embodiments, the plant is a monocot. In some embodiments, the monocot is a cereal. In some embodiments, the cereal is, for example, maize, wheat, barley, oats, rye, millet, sorghum, triticale, secale, einkorn, spelt, emmer, teff, milo, flax, grama grass, Tripsacum, or teosinte. In some embodiments, the cereal is maize.

In some embodiments, the expression cassette is expressed throughout the plant. In some embodiments, the expression cassette is expressed in a specific location or tissue of a plant. In some embodiments, the location or tissue includes, but is not limited to, epidermis, root, vascular tissue, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof. In some embodiments, the location or tissue is a seed. In some embodiments, the location or tissue is a protein body of a seed.

Embodiments of the presently disclosed subject matter also relate to an expression vector comprising a nucleic acid molecule selected from the group consisting of:

-   -   (a) a nucleic acid encoding a polypeptide as listed in SEQ ID NO: 2, or ortholog thereof;
    -   (b) a fragment, domain, or featured region of a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1; and
    -   (c) a complete nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a fragment thereof, in combination with a heterologous sequence.

In some embodiments, the expression vector comprises one or more elements including, but not limited to, a promoter-enhancer sequence, a selection marker sequence, an origin of replication, an epitope tag-encoding sequence, and an affinity purification tag-encoding sequence. In some embodiments, the promoter-enhancer sequence comprises, for example, the cauliflower mosaic virus (CaMV) 35S promoter, the CaMV 19S promoter, the tobacco PR-1a promoter, the ubiquitin promoter, or the phaseolin promoter. In some embodiments, the promoter is operable in plants, and in some embodiments, the promoter is a constitutive or inducible promoter. In some embodiments, the selection marker sequence encodes an antibiotic resistance gene. In some embodiments, the epitope tag sequence encodes the V5 epitope tag (GKPIPNPLLGLDST; SEQ ID NO: 9; Southern et al., 1991), the peptide FHHTT (SEQ ID NO: 10), hemagglutinin, or glutathione-S-transferase. In some embodiments, the affinity purification tag sequence encodes a polyamino acid sequence or a polypeptide. In some embodiments, the polyamino acid sequence comprises polyhistidine. In some embodiments, the polypeptide is chitin-binding domain or glutathione-S-transferase. In some embodiments, the affinity purification tag sequence comprises an intein encoding sequence.

In some embodiments, the expression vector comprises a eukaryotic expression vector, and in some embodiments, the expression vector comprises a prokaryotic expression vector. In some embodiments, the eukaryotic expression vector comprises a tissue-specific promoter. In some embodiments, the expression vector is operable in plants.

Embodiments of the presently disclosed subject matter also relate to a cell comprising a nucleic acid construct comprising an expression vector and a nucleic acid comprising a nucleic acid encoding a polypeptide that is an ortholog of a polypeptide as listed in SEQ ID NO: 2, or a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence listed in SEQ ID NO: 1, or a subsequence thereof, in combination with a heterologous sequence.

In some embodiments, the cell is a bacterial cell, a fungal cell, a plant cell, or an animal cell. In some embodiments, the polypeptide is expressed in a specific location or tissue of a plant. In some embodiments, the location or tissue includes, but is not limited to, epidermis, root, vascular tissue, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof. In some embodiments, the location or tissue is a seed.

Prokaryotic cells including, but not limited to, Escherichia coli and other microbial strains known to those in the art, can be used as host cells. Methods for expressing polypeptides in prokaryotic cells are well known to those in the art and can be found in many laboratory manuals such as Sambrook & Russell, 2001. A variety of promoters, ribosome binding sites, and operators to control expression are available to those skilled in the art, as are selectable markers such as antibiotic resistance genes. The type of vector is chosen to allow for optimal growth and expression in the selected cell type.

A variety of eukaryotic expression systems are available such as, for example, yeast, insect cell lines, plant cells, and mammalian cells. Expression and synthesis of heterologous polypeptides in yeast is well known (see Sherman et al., 1982). Yeast strains widely used for production of eukaryotic polypeptides are Saccharomyces cerevisiae and Pichia pastoris, and vectors, strains, and protocols for expression are available from commercial suppliers (e.g., Invitrogen Corp., Carlsbad, Calif., United States of America).

Mammalian cell systems can be transformed with expression vectors for production of polypeptides. Suitable host cell lines available to those in the art include, but are not limited to, the HEK293, BHK21, and CHO cell lines. Expression vectors for these cells can include expression control sequences such as an origin of replication, a promoter (e.g., the CMV promoter, a Herpes Simplex Virus thymidine kinase (HSV-tk) promoter, or a phosphoglycerate kinase (pgk) promoter), an enhancer, and polypeptide processing sites such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcription terminator sequences. Other animal cell lines useful for the production of polypeptides are available commercially or from depositories such as the American Type Culture Collection (Manassas, Va., United States of America).

Expression vectors for expressing polypeptides in insect cells are usually derived from baculovirus or other viruses known in the art. A number of suitable insect cell lines are available including, but not limited to, mosquito larvae, silkworm, armyworm (for example, Spodoptera frugiperda), moth, and Drosophila cell lines.

Methods of transforming animal and lower eukaryotic cells are known. Numerous methods can be used to introduce heterologous DNA into eukaryotic cells including, but not limited to, calcium phosphate precipitation, fusion of the recipient cell with bacterial protoplasts containing the DNA, treatment of the recipient cells with liposomes containing the DNA, DEAE dextran, electroporation, biolistics, and microinjection of the DNA directly into the cells. Transformed cells are cultured using means well known in the art (see Kuchler, 1997).

Once a polypeptide of the presently disclosed subject matter is expressed it can be isolated and purified from the expressing cells using methods known to those skilled in the art. The purification process can be monitored using Western blot techniques, radioimmunoassay, or other standard immunoassay techniques. Polypeptide purification techniques are commonly known and used by those skilled in the art (see Scopes, 1982; Deutscher, 1990).

Embodiments of the presently disclosed subject matter provide a method for producing a recombinant polypeptide in which the expression vector comprises one or more elements including, but not limited to, a promoter-enhancer sequence, a selection marker sequence, an origin of replication, an epitope tag-encoding sequence, and an affinity purification tag-encoding sequence. In some embodiments, the nucleic acid construct comprises an epitope tag-encoding sequence and the isolating step employs an antibody specific for the epitope tag. In some embodiments, the nucleic acid construct comprises a polyamino acid-encoding sequence and the isolating step employs a resin comprising a polyamino acid binding substance, in some embodiments where the polyamino acid is polyhistidine and the polyamino acid binding resin is nickel-charged agarose resin. In some embodiments, the nucleic acid construct comprises a polypeptide-encoding sequence and the isolating step employs a resin comprising a polypeptide binding substance. In some embodiments, the polypeptide is a chitin-binding domain and the resin contains chitin-sepharose.

The polypeptides of the presently disclosed subject matter can be synthesized using non-cellular synthetic methods known to those in the art. Techniques for solid phase synthesis are disclosed in Barany & Merrifield, 1980; Merrifield et al., 1963; Stewart & Young, 1984.

The presently disclosed subject matter further provides a method for modifying (i.e. increasing or decreasing) the concentration or composition of a polypeptide of the presently disclosed subject matter in a plant or part thereof. Modification can be effected by increasing or decreasing the concentration and/or the composition (i.e. the ratio of the polypeptides of the presently disclosed subject matter) in a plant. The method comprises introducing into a plant cell an expression cassette comprising a nucleic acid molecule of the presently disclosed subject matter as disclosed above to obtain a transformed plant cell or tissue, and culturing the transformed plant cell or tissue. The nucleic acid molecule can be under the regulation of a constitutive or inducible promoter. The method can further comprise inducing or repressing expression of a nucleic acid molecule of a sequence in the plant for a time sufficient to modify the concentration and/or composition in the plant or plant part.

A plant or plant part having modified expression of a nucleic acid molecule of the presently disclosed subject matter can be analyzed and selected using methods known to those skilled in the art including, but not limited to, Southern blotting, DNA sequencing, or PCR analysis using primers specific to the nucleic acid molecule and detecting amplicons produced therefrom.

In general, a concentration or composition is increased or decreased by at least in some embodiments 5%, in some embodiments 10%, in some embodiments 20%, in some embodiments 30%, in some embodiments 40%, in some embodiments 50%, in some embodiments 60%, in some embodiments 70%, in some embodiments 80%, and in some embodiments 90% relative to a native control plant, plant part, or cell lacking the expression cassette.
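As a minimal sketch of how such a relative change might be computed, the following Python functions express a measured concentration as a signed percent change against a control and compare its magnitude with one of the thresholds listed above. The function names and the example values are hypothetical; the disclosure does not prescribe any particular calculation or assay.

```python
def percent_change(transformed: float, control: float) -> float:
    """Signed percent change of a measurement relative to a control value."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return (transformed - control) * 100.0 / control


def meets_threshold(transformed: float, control: float, threshold_pct: float) -> bool:
    """True when the increase or decrease is at least threshold_pct percent."""
    return abs(percent_change(transformed, control)) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical polypeptide concentrations in arbitrary units.
    print(percent_change(12.0, 10.0))
    print(meets_threshold(5.0, 10.0, 50.0))
```

A decrease is reported as a negative value, so the same threshold check applies to both increases and decreases.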

IV.B. Homologous Recombination

In some embodiments, at least one genomic copy corresponding to a nucleotide sequence of the presently disclosed subject matter is modified in the genome of the plant by homologous recombination as further illustrated in Paszkowski et al., 1988. This technique uses the ability of homologous sequences to recognize each other and to exchange nucleotide sequences between respective nucleic acid molecules by a process known in the art as homologous recombination. Homologous recombination can occur between the chromosomal copy of a nucleotide sequence in a cell and an incoming copy of the nucleotide sequence introduced in the cell by transformation. Specific modifications are thus accurately introduced in the chromosomal copy of the nucleotide sequence. In some embodiments, the regulatory elements of the nucleotide sequence of the presently disclosed subject matter are modified. Such regulatory elements are easily obtainable by screening a genomic library using the nucleotide sequence of the presently disclosed subject matter, or a portion thereof, as a probe. The existing regulatory elements are replaced by different regulatory elements, thus altering expression of the nucleotide sequence, or they are mutated or deleted, thus abolishing the expression of the nucleotide sequence. In some embodiments, the nucleotide sequence is modified by deletion of a part of the nucleotide sequence or the entire nucleotide sequence, or by mutation. Expression of a mutated polypeptide in a plant cell is also provided in the presently disclosed subject matter. Recent refinements of this technique to disrupt endogenous plant genes have been disclosed (Kempin et al., 1997; Miao & Lam, 1995).

In some embodiments, a mutation in the chromosomal copy of a nucleotide sequence is introduced by transforming a cell with a chimeric oligonucleotide composed of a contiguous stretch of RNA and DNA residues in a duplex conformation with double hairpin caps on the ends. An additional feature of the oligonucleotide is, for example, the presence of 2′-O-methylation at the RNA residues. The RNA/DNA sequence is designed to align with the sequence of a chromosomal copy of a nucleotide sequence of the presently disclosed subject matter and to contain the desired nucleotide change. This technique is further illustrated, for example, in U.S. Pat. No. 5,501,967 and Zhu et al., 1999.

IV.C. Overexpression in a Plant Cell

In some embodiments, a nucleotide sequence of the presently disclosed subject matter encoding a polypeptide is over-expressed. Examples of nucleic acid molecules and expression cassettes for over-expression of a nucleic acid molecule of the presently disclosed subject matter are disclosed above. Methods known to those skilled in the art of over-expression of nucleic acid molecules are also encompassed by the presently disclosed subject matter.

In some embodiments, the expression of the nucleotide sequence of the presently disclosed subject matter is altered in every cell of a plant. This can be obtained, for example, through homologous recombination or by insertion into a chromosome. This can also be obtained, for example, by expressing a sense or antisense RNA, zinc finger polypeptide, or ribozyme under the control of a promoter capable of expressing the sense or antisense RNA, zinc finger polypeptide, or ribozyme in every cell of a plant. Constitutive, inducible, tissue-specific, or developmentally-regulated expression are also within the scope of the presently disclosed subject matter and result in a constitutive, inducible, tissue-specific, or developmentally-regulated alteration of the expression of a nucleotide sequence of the presently disclosed subject matter in the plant cell. Constructs for expression of the sense or antisense RNA, zinc finger polypeptide, or ribozyme, or for over-expression of a nucleotide sequence of the presently disclosed subject matter, can be prepared and transformed into a plant cell according to the teachings of the presently disclosed subject matter, for example, as disclosed herein.

IV.D. Construction of Plant Expression Vectors

Coding sequences intended for expression in transgenic plants can be first assembled in expression cassettes operably linked to a suitable promoter expressible in plants. The expression cassettes can also comprise any further sequences required or selected for the expression of the transgene. Such sequences include, but are not limited to, transcription terminators, extraneous sequences to enhance expression such as introns, viral sequences, and sequences intended for the targeting of the gene product to specific organelles and cell compartments. These expression cassettes can then be easily transferred to the plant transformation vectors disclosed below. The following is a description of various components of typical expression cassettes.

IV.D.1. Promoters

The selection of the promoter used in expression cassettes can determine the spatial and temporal expression pattern of the transgene in the transgenic plant. Selected promoters can express transgenes in specific cell types (for example, leaf epidermal cells, mesophyll cells, root cortex cells, and/or endosperm cells) or in specific tissues or organs (for example, roots, leaves, flowers, and/or seeds) and the selection can reflect the desired location for accumulation of the gene product. Alternatively, the selected promoter can drive expression of the gene under various inducing conditions. Promoters vary in their strengths; i.e., their abilities to promote transcription. Depending upon the host cell system utilized, any one of a number of suitable promoters can be used, including the gene's native promoter. The following are non-limiting examples of promoters that can be used in expression cassettes.

IV.D.1.a. Constitutive Expression: the Ubiquitin Promoter

Ubiquitin is a gene product known to accumulate in many cell types and its promoter has been cloned from several species for use in transgenic plants (e.g. sunflower: Binet et al., 1991; maize: Christensen & Quail, 1989; and Arabidopsis: Callis et al., 1990; Norris et al., 1993). The maize ubiquitin promoter has been developed in transgenic monocot systems and its sequence and vectors constructed for monocot transformation are disclosed in the patent publication EP 0 342 926 (to Lubrizol) which is herein incorporated by reference. Taylor et al., 1993, describes a vector (pAHC25) that comprises the maize ubiquitin promoter and first intron and its high activity in cell suspensions of numerous monocotyledons when introduced via microprojectile bombardment. The Arabidopsis ubiquitin promoter is suitable for use with the nucleotide sequences of the presently disclosed subject matter. The ubiquitin promoter is suitable for gene expression in transgenic plants, both monocotyledons and dicotyledons. Suitable vectors are derivatives of pAHC25 or any of the transformation vectors disclosed herein, modified by the introduction of the appropriate ubiquitin promoter and/or intron sequences.

IV.D.1.b. Constitutive Expression: the CaMV 35S Promoter

Construction of the plasmid pCGN1761 is disclosed in the published patent application EP 0 392 225 (Example 23), which is hereby incorporated by reference. pCGN1761 contains the “double” CaMV 35S promoter and the tml transcriptional terminator with a unique EcoRI site between the promoter and the terminator and has a pUC-type backbone. A derivative of pCGN1761 is constructed which has a modified polylinker that includes NotI and XhoI sites in addition to the existing EcoRI site. This derivative is designated pCGN1761ENX. pCGN1761ENX is useful for the cloning of cDNA sequences or coding sequences (including microbial ORF sequences) within its polylinker for the purpose of their expression under the control of the 35S promoter in transgenic plants. The entire 35S promoter-coding sequence-tml terminator cassette of such a construction can be excised by Hind III, Sph I, Sal I, and Xba I sites 5′ to the promoter and Xba I, BamH I and Bgl I sites 3′ to the terminator for transfer to transformation vectors such as those disclosed below. Furthermore, the double 35S promoter fragment can be removed by 5′ excision with Hind III, Sph I, Sal I, Xba I, or Pst I, and 3′ excision with any of the polylinker restriction sites (EcoR I, Not I or Xho I) for replacement with another promoter. If desired, modifications around the cloning sites can be made by the introduction of sequences that can enhance translation. This is particularly useful when overexpression is desired. For example, pCGN1761ENX can be modified by optimization of the translational initiation site as disclosed in Example 37 of U.S. Pat. No. 5,639,949, incorporated herein by reference.

IV.D.1.c. Constitutive Expression: the Actin Promoter

Several isoforms of actin are known to be expressed in most cell types and consequently the actin promoter can be used as a constitutive promoter. In particular, the promoter from the rice Act1 gene has been cloned and characterized (McElroy et al., 1990). A 1.3 kilobase (kb) fragment of the promoter was found to contain all the regulatory elements required for expression in rice protoplasts. Furthermore, numerous expression vectors based on the Act1 promoter have been constructed specifically for use in monocotyledons (McElroy et al., 1991). These incorporate the Act1-intron 1, Adh1 5′ flanking sequence (from the maize alcohol dehydrogenase gene) and Adh1-intron 1, and sequence from the CaMV 35S promoter. Vectors showing highest expression were fusions of 35S and Act1 intron or the Act1 5′ flanking sequence and the Act1 intron. Optimization of sequences around the initiating ATG (of the β-glucuronidase (GUS) reporter gene) also enhanced expression. The promoter expression cassettes disclosed in McElroy et al., 1991, can be easily modified for gene expression and are particularly suitable for use in monocotyledonous hosts. For example, promoter-containing fragments are removed from the McElroy constructions and used to replace the double 35S promoter in pCGN1761ENX, which is then available for the insertion of specific gene sequences. The fusion genes thus constructed can then be transferred to appropriate transformation vectors. In a separate report, the rice Act1 promoter with its first intron has also been found to direct high expression in cultured barley cells (Chibbar et al., 1993).

IV.D.1.d. Inducible Expression: PR-1 Promoters

The double 35S promoter in pCGN1761ENX can be replaced with any other promoter of choice that will result in suitably high expression levels. By way of example, one of the chemically regulatable promoters disclosed in U.S. Pat. No. 5,614,395, such as the tobacco PR-1a promoter, can replace the double 35S promoter. Alternatively, the Arabidopsis PR-1 promoter disclosed in Lebel et al., 1998, can be used. The promoter of choice can be excised from its source by restriction enzymes, but can alternatively be PCR-amplified using primers that carry appropriate terminal restriction sites. Should PCR-amplification be undertaken, the promoter can be re-sequenced to check for amplification errors after the cloning of the amplified promoter in the target vector. The chemically/pathogen regulatable tobacco PR-1a promoter is cleaved from plasmid pCIB1004 (for construction, see example 21 of EP 0 332 104, which is hereby incorporated by reference) and transferred to plasmid pCGN1761ENX (Uknes et al., 1992). pCIB1004 is cleaved with Nco I and the resulting 3′ overhang of the linearized fragment is rendered blunt by treatment with T4 DNA polymerase. The fragment is then cleaved with Hind III and the resultant PR-1a promoter-containing fragment is gel purified and cloned into pCGN1761ENX from which the double 35S promoter has been removed. This is accomplished by cleavage with Xho I and blunting with T4 polymerase, followed by cleavage with Hind III, and isolation of the larger vector-terminator containing fragment into which the pCIB1004 promoter fragment is cloned. This generates a pCGN1761ENX derivative with the PR-1a promoter and the tml terminator and an intervening polylinker with unique EcoR I and Not I sites. The selected coding sequence can be inserted into this vector, and the fusion products (i.e. promoter-gene-terminator) can subsequently be transferred to any selected transformation vector, including those disclosed herein. Various chemical regulators can be employed to induce expression of the selected coding sequence in the plants transformed according to the presently disclosed subject matter, including the benzothiadiazole, isonicotinic acid, and salicylic acid compounds disclosed in U.S. Pat. Nos. 5,523,311 and 5,614,395.
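The option of PCR-amplifying a promoter with primers that carry terminal restriction sites can be sketched in Python as follows. This is an illustration only: the choice of Hind III and EcoR I as the tail sites, the 20-nucleotide annealing length, and the two-base 5′ clamp are assumptions made for the example, and the template is a placeholder rather than any promoter sequence disclosed herein.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Recognition sequences (5'->3') of two commonly used enzymes.
SITES = {"HindIII": "AAGCTT", "EcoRI": "GAATTC"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tailed_primers(template: str, fwd_site: str, rev_site: str,
                   anneal_len: int = 20, clamp: str = "GC") -> tuple[str, str]:
    """Design forward and reverse primers with 5' restriction-site tails.

    The clamp is a short run of extra bases 5' of the site (an assumption
    of this sketch), the site is looked up in SITES, and the annealing
    region is copied from the ends of the template.
    """
    if len(template) < 2 * anneal_len:
        raise ValueError("template is too short for the requested annealing length")
    forward = clamp + SITES[fwd_site] + template[:anneal_len]
    reverse = clamp + SITES[rev_site] + reverse_complement(template[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    placeholder_promoter = "AAAAACCCCCGGGGGTTTTT" * 2  # not a real promoter
    fwd, rev = tailed_primers(placeholder_promoter, "HindIII", "EcoRI", anneal_len=10)
    print(fwd)
    print(rev)
```

Because the tails add sequence that is absent from the template, the cloned amplicon would still be re-sequenced as described above to check for amplification errors.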

IV.D.1.e. Inducible Expression: an Ethanol-Inducible Promoter

A promoter inducible by certain alcohols or ketones, such as ethanol, can also be used to confer inducible expression of a coding sequence of the presently disclosed subject matter. Such a promoter is, for example, the alcA gene promoter from Aspergillus nidulans (Caddick et al., 1998). In A. nidulans, the alcA gene encodes alcohol dehydrogenase 1, the expression of which is regulated by the AlcR transcription factors in the presence of the chemical inducer. For the purposes of the presently disclosed subject matter, the CAT coding sequences in plasmid palcA:CAT comprising an alcA gene promoter sequence fused to a minimal 35S promoter (Caddick et al., 1998) are replaced by a coding sequence of the presently disclosed subject matter to form an expression cassette having the coding sequence under the control of the alcA gene promoter. This is carried out using methods known in the art.

IV.D.1.f. Inducible Expression: a Glucocorticoid-Inducible Promoter

Induction of expression of a nucleotide sequence of the presently disclosed subject matter using systems based on steroid hormones is also provided. For example, a glucocorticoid-mediated induction system is used (Aoyama & Chua, 1997) and gene expression is induced by application of a glucocorticoid, for example a synthetic glucocorticoid, for example dexamethasone, at a concentration ranging in some embodiments from 0.1 mM to 1 mM, and in some embodiments from 10 mM to 100 mM. For the purposes of the presently disclosed subject matter, the luciferase gene sequences of Aoyama & Chua, 1997 are replaced by a nucleotide sequence of the presently disclosed subject matter to form an expression cassette having a nucleotide sequence of the presently disclosed subject matter under the control of six copies of the GAL4 upstream activating sequences fused to the 35S minimal promoter. This is carried out using methods known in the art. The trans-acting factor comprises the GAL4 DNA-binding domain (Keegan et al., 1986) fused to the transactivating domain of the herpes viral polypeptide VP16 (Triezenberg et al., 1988) fused to the hormone-binding domain of the rat glucocorticoid receptor (Picard et al., 1988). The expression of the fusion polypeptide is controlled by a promoter either known in the art or disclosed herein. A plant comprising an expression cassette comprising a nucleotide sequence of the presently disclosed subject matter fused to the 6×GAL4/minimal promoter is also provided. Thus, tissue- or organ-specificity of the fusion polypeptide is achieved leading to inducible tissue- or organ-specificity of the nucleotide sequence to be expressed.

IV.D.1.g. Root Specific Expression

Another pattern of gene expression is root expression. A suitable root promoter is the promoter of the maize metallothionein-like (MTL) gene disclosed in de Framond, 1991, and also in U.S. Pat. No. 5,466,785, each of which is incorporated herein by reference. This “MTL” promoter is transferred to a suitable vector such as pCGN1761ENX for the insertion of a selected gene and subsequent transfer of the entire promoter-gene-terminator cassette to a transformation vector of interest.

IV.D.1.h. Wound-Inducible Promoters

Wound-inducible promoters can also be suitable for gene expression. Numerous such promoters have been disclosed (e.g. Xu et al., 1993; Logemann et al., 1989; Rohrmeier & Lehle, 1993; Firek et al., 1993; Warner et al., 1993) and all are suitable for use with the presently disclosed subject matter. Logemann et al. describe the 5′ upstream sequences of the dicotyledonous potato wun1 gene. Xu et al. show that a wound-inducible promoter from the dicotyledon potato (pin2) is active in the monocotyledon rice. Further, Rohrmeier & Lehle describe the cloning of the maize Wip1 cDNA that is wound induced and which can be used to isolate the cognate promoter using standard techniques. Similarly, Firek et al. and Warner et al. have disclosed a wound-induced gene from the monocotyledon Asparagus officinalis, which is expressed at local wound and pathogen invasion sites. Using cloning techniques well known in the art, these promoters can be transferred to suitable vectors, fused to the genes pertaining to the presently disclosed subject matter, and used to express these genes at the sites of plant wounding.

IV.D.1.i. Pith-Preferred Expression

PCT International Publication WO 93/07278, which is herein incorporated by reference, describes the isolation of the maize trpA gene, which is preferentially expressed in pith cells. The gene sequence and promoter extending up to −1726 basepairs (bp) from the start of transcription are presented. Using standard molecular biological techniques, this promoter, or parts thereof, can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a foreign gene in a pith-preferred manner. In fact, fragments containing the pith-preferred promoter or parts thereof can be transferred to any vector and modified for utility in transgenic plants.

IV.D.1.j. Leaf-Specific Expression

A maize gene encoding phosphoenol carboxylase (PEPC) has been disclosed by Hudspeth & Grula, 1989. Using standard molecular biological techniques, the promoter for this gene can be used to drive the expression of any gene in a leaf-specific manner in transgenic plants.

IV.D.1.k. Pollen-Specific Expression

WO 93/07278 describes the isolation of the maize calcium-dependent protein kinase (CDPK) gene that is expressed in pollen cells. The gene sequence and promoter extend up to 1400 basepairs (bp) from the start of transcription. Using standard molecular biological techniques, this promoter or parts thereof can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a nucleotide sequence of the presently disclosed subject matter in a pollen-specific manner.

IV.D.1.l. Seed-Specific Expression

In some embodiments, nucleic acid molecules of the presently disclosed subject matter can be expressed in the seed, and/or in a protein body present within a seed. As disclosed herein, a nucleotide sequence isolated from the Q protein gene, SEQ ID NO: 6, can be used to direct expression of heterologous sequences in the seeds of plants. Using standard molecular biological techniques, this promoter or parts thereof can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a nucleotide sequence of the presently disclosed subject matter in a seed-specific manner.

IV.D.2. Transcriptional Terminators

A variety of transcriptional terminators are available for use in expression cassettes. These are responsible for termination of transcription and correct mRNA polyadenylation. Appropriate transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator, and the pea rbcS E9 terminator. These can be used in both monocotyledons and dicotyledons. In addition, a gene's native transcription terminator can be used.

IV.D.3. Sequences for the Enhancement or Regulation of Expression

Numerous sequences have been found to enhance gene expression from within the transcriptional unit and these sequences can be used in conjunction with the genes of the presently disclosed subject matter to increase their expression in transgenic plants.

Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells. For example, the introns of the maize Adh1 gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells. Intron 1 was found to be particularly effective and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al., 1987). In the same experimental system, the intron from the maize bronze 1 gene had a similar effect in enhancing expression. Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.

A number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells. Specifically, leader sequences from Tobacco Mosaic Virus (TMV; the “Ω-sequence”), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in enhancing expression (see e.g. Gallie et al., 1987; Skuzeski et al., 1990). Other leader sequences known in the art include, but are not limited to, picornavirus leaders, for example, encephalomyocarditis virus (EMCV) leader (5′ noncoding region; see Elroy-Stein et al., 1989); potyvirus leaders, for example, from Tobacco Etch Virus (TEV; see Allison et al., 1986); Maize Dwarf Mosaic Virus (MDMV; see Kong & Steinbiss 1998); human immunoglobulin heavy-chain binding polypeptide (BiP) leader (Macejak & Sarnow, 1991); untranslated leader from the coat polypeptide mRNA of alfalfa mosaic virus (AMV; RNA 4; see Jobling & Gehrke, 1987); tobacco mosaic virus (TMV) leader (Gallie et al., 1989); and Maize Chlorotic Mottle Virus (MCMV) leader (Lommel et al., 1991). See also, Della-Cioppa et al., 1987.

In addition to incorporating one or more of the aforementioned elements into the 5′ regulatory region of a target expression cassette of the presently disclosed subject matter, other elements can also be incorporated. Such elements include, but are not limited to, a minimal promoter. By minimal promoter it is intended that the basal promoter elements are inactive or nearly so in the absence of upstream or downstream activation. Such a promoter has low background activity in plants when there is no transactivator present or when enhancer or response element binding sites are absent. One minimal promoter that is particularly useful for target genes in plants is the Bz1 minimal promoter, which is obtained from the bronze 1 gene of maize. The Bz1 core promoter is obtained from the “myc” mutant Bz1-luciferase construct pBz1LucR98 via cleavage at the NheI site located at positions −53 to −58 (Roth et al., 1991). The derived Bz1 core promoter fragment thus extends from positions −53 to +227 and includes the Bz1 intron-1 in the 5′ untranslated region. Also useful for the presently disclosed subject matter is a minimal promoter created by use of a synthetic TATA element. The TATA element allows recognition of the promoter by RNA polymerase factors and confers a basal level of gene expression in the absence of activation (see generally, Mukumoto et al., 1993; Green, 2000).

IV.D.4. Targeting of the Gene Product within the Cell

Various mechanisms for targeting gene products are known to exist in plants and the sequences controlling the functioning of these mechanisms have been characterized in some detail. For example, the targeting of gene products to the chloroplast is controlled by a signal sequence found at the amino terminal end of various polypeptides that is cleaved during chloroplast import to yield the mature polypeptides (see e.g., Comai et al., 1988). These signal sequences can be fused to heterologous gene products to effect the import of heterologous products into the chloroplast (Van den Broeck et al., 1985). DNA encoding appropriate signal sequences can be isolated from the 5′ end of the cDNAs encoding the ribulose-1,5-bisphosphate carboxylase/oxygenase (RUBISCO) polypeptide, the chlorophyll a/b binding (CAB) polypeptide, the 5-enol-pyruvyl shikimate-3-phosphate (EPSP) synthase enzyme, the GS2 polypeptide and many other polypeptides which are known to be chloroplast localized. See also, the section entitled “Expression With Chloroplast Targeting” in Example 37 of U.S. Pat. No. 5,639,949, herein incorporated by reference.

Other gene products can be localized to other organelles such as the mitochondrion and the peroxisome (e.g. Unger et al., 1989). The cDNAs encoding these products can also be manipulated to effect the targeting of heterologous gene products to these organelles. Examples of such sequences are the nuclear-encoded ATPases and specific aspartate aminotransferase isoforms for mitochondria. Targeting cellular polypeptide bodies has been disclosed by Rogers et al., 1985.

In addition, sequences have been characterized that control the targeting of gene products to other cell compartments. Amino terminal sequences are responsible for targeting to the endoplasmic reticulum (ER), the apoplast, and extracellular secretion from aleurone cells (Koehler & Ho, 1990). Additionally, amino terminal sequences in conjunction with carboxy terminal sequences are responsible for vacuolar targeting of gene products (Shinshi et al., 1990).

By the fusion of the appropriate targeting sequences disclosed above to transgene sequences of interest it is possible to direct the transgene product to any organelle or cell compartment. For chloroplast targeting, for example, the chloroplast signal sequence from the RUBISCO gene, the CAB gene, the EPSP synthase gene, or the GS2 gene is fused in frame to the amino terminal ATG of the transgene. The signal sequence selected can include the known cleavage site, and the fusion constructed can take into account any amino acids after the cleavage site that are required for cleavage. In some cases this requirement can be fulfilled by the addition of a small number of amino acids between the cleavage site and the transgene ATG or, alternatively, replacement of some amino acids within the transgene sequence. Fusions constructed for chloroplast import can be tested for efficacy of chloroplast uptake by in vitro translation of in vitro transcribed constructions followed by in vitro chloroplast uptake using techniques disclosed by Bartlett et al., 1982 and Wasmann et al., 1986. These construction techniques are well known in the art and are equally applicable to mitochondria and peroxisomes.
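The in-frame requirement for such fusions can be illustrated with a short Python sketch that joins a signal-sequence coding region, an optional linker encoding the extra amino acids, and a transgene coding region, and rejects combinations that would shift the reading frame or introduce a premature stop codon. The function names are hypothetical and the sequences in the example are short placeholders, not actual RUBISCO, CAB, EPSP synthase, or GS2 signal sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive complete codons, starting at position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(signal_cds: str, linker: str, transgene_cds: str) -> str:
    """Join signal sequence, linker, and transgene, keeping one reading frame.

    signal_cds must start with ATG and carry no stop codon; linker encodes
    any extra amino acids placed before the transgene ATG and may be empty;
    transgene_cds must start with ATG.
    """
    signal_cds, linker, transgene_cds = (
        s.upper() for s in (signal_cds, linker, transgene_cds)
    )
    if not signal_cds.startswith("ATG") or not transgene_cds.startswith("ATG"):
        raise ValueError("signal sequence and transgene must each start with ATG")
    if len(signal_cds) % 3 or len(linker) % 3:
        raise ValueError("signal sequence and linker lengths must be multiples of 3")
    if STOP_CODONS & set(codons(signal_cds + linker)):
        raise ValueError("stop codon upstream of the transgene ATG")
    return signal_cds + linker + transgene_cds


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    fused = fuse_in_frame("ATGGCTTCC", "GGT", "ATGAAATAA")
    print(fused, codons(fused))
```

A check of this kind only confirms that the reading frame is preserved; whether the fusion is actually imported would still be tested by the in vitro uptake assays cited above.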

And finally, using a nucleotide sequence comprising the Q protein promoter (SEQ ID NO: 6), seed-specific expression of a heterologous sequence can be accomplished. Thus, an expression construct comprising a heterologous coding sequence operably linked to SEQ ID NO: 6 can be employed to direct expression of the nucleotide sequence in the seed, and in some embodiments, can be used to produce protein bodies comprising the polypeptide encoded by the heterologous coding sequence.

The above-disclosed mechanisms for cellular targeting can be utilized not only in conjunction with their cognate promoters, but also in conjunction with heterologous promoters so as to effect a specific cell-targeting goal under the transcriptional regulation of a promoter that has an expression pattern different from that of the promoter from which the targeting signal derives.

IV.E. Construction of Plant Transformation Vectors

Numerous transformation vectors available for plant transformation are known to those of ordinary skill in the plant transformation art, and the genes pertinent to the presently disclosed subject matter can be used in conjunction with any such vectors. The selection of vector will depend upon the selected transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers might be employed. Selection markers used routinely in transformation include the nptII gene, which confers resistance to kanamycin and related antibiotics (Messing & Vieira, 1982; Bevan et al., 1983); the bar gene, which confers resistance to the herbicide phosphinothricin (White et al., 1990; Spencer et al., 1990); the hph gene, which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann, 1984); the dhfr gene, which confers resistance to methotrexate (Bourouis & Jarry, 1983); the EPSP synthase gene, which confers resistance to glyphosate (U.S. Pat. Nos. 4,940,935 and 5,188,642); and the mannose-6-phosphate isomerase gene, which provides the ability to metabolize mannose (U.S. Pat. Nos. 5,767,378 and 5,994,629).

IV.E.1. Vectors Suitable for Agrobacterium Transformation

Many vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, 1984). Below, the construction of two typical vectors suitable for Agrobacterium transformation is disclosed.

IV.E.1.a. pCIB200 and pCIB2001

The binary vectors pCIB200 and pCIB2001 are used for the construction of recombinant vectors for use with Agrobacterium and are constructed in the following manner. pTJS75kan is created by Nar I digestion of pTJS75 (Schmidhauser & Helinski, 1985) allowing excision of the tetracycline-resistance gene, followed by insertion of an Acc I fragment from pUC4K carrying an NPTII sequence (Messing & Vieira, 1982; Bevan et al., 1983; McBride & Summerfelt, 1990). Xho I linkers are ligated to the EcoRV fragment of pCIB7, which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein et al., 1987), and the Xho I-digested fragment is cloned into Sal I-digested pTJS75kan to create pCIB200 (see also EP 0 332 104, example 19). pCIB200 contains the following unique polylinker restriction sites: EcoR I, Sst I, Kpn I, Bgl II, Xba I, and Sal I. pCIB2001 is a derivative of pCIB200 created by the insertion into the polylinker of additional restriction sites. Unique restriction sites in the polylinker of pCIB2001 are EcoR I, Sst I, Kpn I, Bgl II, Xba I, Sal I, Mlu I, Bcl I, Avr II, Apa I, Hpa I, and Stu I. pCIB2001, in addition to containing these unique restriction sites, also has plant and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between E. coli and other hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
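Whether a restriction site is unique within a given sequence can be checked computationally. The Python sketch below counts occurrences of the recognition sequences for the six pCIB200 polylinker enzymes named above. The polylinker string is a toy sequence assembled for illustration and is not the actual pCIB200 sequence. The sketch also treats the sequence as linear, so a site spanning the junction of a circular plasmid would be missed.

```python
# Recognition sequences (5'->3'). All six are palindromic, so scanning one
# strand is sufficient.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "SstI": "GAGCTC",
    "KpnI": "GGTACC",
    "BglII": "AGATCT",
    "XbaI": "TCTAGA",
    "SalI": "GTCGAC",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of site in seq, including overlapping ones."""
    seq, site = seq.upper(), site.upper()
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def unique_enzymes(seq: str, sites: dict[str, str]) -> list[str]:
    """Return the enzymes whose recognition sequence occurs exactly once in seq."""
    return [name for name, site in sites.items() if count_sites(seq, site) == 1]


if __name__ == "__main__":
    toy_polylinker = "GAATTCGAGCTCGGTACCAGATCTTCTAGAGTCGAC"  # illustrative only
    print(unique_enzymes(toy_polylinker, RECOGNITION_SITES))
```

Running the same check on a full vector sequence, rather than the polylinker alone, indicates which sites remain usable for cloning once an expression cassette has been inserted.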

IV.E.1.b. pCIB10 and Hygromycin Selection Derivatives Thereof

The binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in plants, T-DNA right and left border sequences, and incorporates sequences from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its construction is disclosed by Rothstein et al., 1987. Various derivatives of pCIB10 can be constructed which incorporate the gene for hygromycin B phosphotransferase disclosed by Gritz & Davies, 1983. These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin (pCIB715, pCIB717).

IV.E.2. Vectors Suitable for Non-Agrobacterium Transformation

Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector, and consequently vectors lacking these sequences can be utilized in addition to vectors such as the ones disclosed above that contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g. polyethylene glycol (PEG) and electroporation), and microinjection. The choice of vector depends largely on the species being transformed. Below, the construction of typical vectors suitable for non-Agrobacterium transformation is disclosed.

IV.E.2.a. pCIB3064

pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in combination with selection by the herbicide BASTA® (glufosinate ammonium or phosphinothricin). The plasmid pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli β-glucuronidase (GUS) gene and the CaMV 35S transcriptional terminator and is disclosed in the PCT International Publication WO 93/07278. The 35S promoter of this vector contains two ATG sequences 5′ of the start site. These sites are mutated using standard PCR techniques in such a way as to remove the ATGs and generate the restriction sites Ssp I and Pvu II. The new restriction sites are 96 and 37 bp away from the unique Sal I site and 101 and 42 bp away from the actual start site. The resultant derivative of pCIB246 is designated pCIB3025. The GUS gene is then excised from pCIB3025 by digestion with Sal I and Sac I, and the termini are rendered blunt and religated to generate plasmid pCIB3060. The plasmid pJIT82 is obtained from the John Innes Centre, Norwich, England, and the 400 bp Sma I fragment containing the bar gene from Streptomyces viridochromogenes is excised and inserted into the Hpa I site of pCIB3060 (Thompson et al., 1987). This generates pCIB3064, which comprises the bar gene under the control of the CaMV 35S promoter and terminator for herbicide selection, a gene for ampicillin resistance (for selection in E. coli), and a polylinker with the unique sites Sph I, Pst I, Hind III, and BamH I. This vector is suitable for the cloning of plant expression cassettes containing their own regulatory signals.

IV.E.2.b. pSOG19 and pSOG35

pSOG35 is a transformation vector that utilizes the E. coli dihydrofolate reductase (DHFR) gene as a selectable marker conferring resistance to methotrexate. PCR is used to amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp), and 18 bp of the GUS untranslated leader sequence from pSOG10. A 250-bp fragment encoding the E. coli dihydrofolate reductase type II gene is also amplified by PCR and these two PCR fragments are assembled with a Sac I-Pst I fragment from pBI221 (BD Biosciences Clontech, Palo Alto, Calif., United States of America) that comprises the pUC19 vector backbone and the nopaline synthase terminator. Assembly of these fragments generates pSOG19 that contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene, and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generates the vector pSOG35. pSOG19 and pSOG35 carry the pUC gene for ampicillin resistance and have Hind III, Sph I, Pst I, and EcoR I sites available for the cloning of foreign substances.

IV.E.3. Vector Suitable for Chloroplast Transformation

For expression of a nucleotide sequence of the presently disclosed subject matter in plant plastids, plastid transformation vector pPH143 (PCT International Publication WO 97/32011, example 36) is used. The nucleotide sequence is inserted into pPH143 thereby replacing the protoporphyrinogen oxidase (Protox) coding sequence. This vector is then used for plastid transformation and selection of transformants for spectinomycin resistance. Alternatively, the nucleotide sequence is inserted in pPH143 so that it replaces the aadH gene. In this case, transformants are selected for resistance to Protox inhibitors.

IV.F. Transformation

Once a nucleotide sequence of the presently disclosed subject matter has been cloned into an expression system, it is transformed into a plant cell. The receptor and target expression cassettes of the presently disclosed subject matter can be introduced into the plant cell in a number of art-recognized ways. Methods for regeneration of plants are also well known in the art. For example, Ti plasmid vectors have been utilized for the delivery of foreign DNA, as well as direct DNA uptake, liposomes, electroporation, microinjection, and microprojectiles. In addition, bacteria from the genus Agrobacterium can be utilized to transform plant cells. Below are descriptions of representative techniques for transforming both dicotyledonous and monocotyledonous plants, as well as a representative plastid transformation technique.

IV.F.1. Transformation of Dicotyledons

Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques that do not require Agrobacterium. Non-Agrobacterium techniques involve the uptake of heterologous genetic material directly by protoplasts or cells. This can be accomplished by PEG or electroporation-mediated uptake, particle bombardment-mediated delivery, or microinjection. Examples of these techniques are disclosed in Paszkowski et al., 1984; Potrykus et al., 1985; Reich et al., 1986; and Klein et al., 1987. In each case the transformed cells are regenerated to whole plants using standard techniques known in the art.

Agrobacterium-mediated transformation is a useful technique for transformation of dicotyledons because of its high efficiency of transformation and its broad utility with many different species. Agrobacterium transformation typically involves the transfer of the binary vector carrying the foreign DNA of interest (e.g. pCIB200 or pCIB2001) to an appropriate Agrobacterium strain, the choice of which can depend on the complement of vir genes carried by the host Agrobacterium strain either on a co-resident Ti plasmid or chromosomally (e.g. strain CIB542 for pCIB200 and pCIB2001 (Uknes et al., 1993)). The transfer of the recombinant binary vector to Agrobacterium is accomplished by a triparental mating procedure using E. coli carrying the recombinant binary vector and a helper E. coli strain that carries a plasmid such as pRK2013 and is able to mobilize the recombinant binary vector to the target Agrobacterium strain. Alternatively, the recombinant binary vector can be transferred to Agrobacterium by DNA transformation (Hofgen & Willmitzer, 1988).

Transformation of the target plant species by recombinant Agrobacterium usually involves co-cultivation of the Agrobacterium with explants from the plant and follows protocols well known in the art. Transformed tissue is regenerated on selectable medium carrying the antibiotic or herbicide resistance marker present between the binary plasmid T-DNA borders.

Another approach to transforming plant cells with a gene involves propelling inert or biologically active particles at plant tissues and cells. This technique is disclosed in U.S. Pat. Nos. 4,945,050; 5,036,006; and 5,100,792; all to Sanford et al. Generally, this procedure involves propelling inert or biologically active particles at the cells under conditions effective to penetrate the outer surface of the cell and afford incorporation within the interior thereof. When inert particles are utilized, the vector can be introduced into the cell by coating the particles with the vector containing the desired gene. Alternatively, the target cell can be surrounded by the vector so that the vector is carried into the cell by the wake of the particle. Biologically active particles (e.g., dried yeast cells, dried bacterium, or a bacteriophage, each containing DNA sought to be introduced) can also be propelled into plant cell tissue.

IV.F.2. Transformation of Monocotyledons

Transformation of most monocotyledon species has now also become routine. Exemplary techniques include direct gene transfer into protoplasts using PEG or electroporation, and particle bombardment into callus tissue. Transformations can be undertaken with a single DNA species or multiple DNA species (i.e. co-transformation), and both these techniques are suitable for use with the presently disclosed subject matter. Co-transformation can have the advantage of avoiding complete vector construction and of generating transgenic plants with unlinked loci for the gene of interest and the selectable marker, enabling the removal of the selectable marker in subsequent generations, should this be regarded as desirable. However, a disadvantage of the use of co-transformation is the less than 100% frequency with which separate DNA species are integrated into the genome (Schocher et al., 1986).

Patent Applications EP 0 292 435, EP 0 392 225, and WO 93/07278 describe techniques for the preparation of callus and protoplasts from an elite inbred line of maize, transformation of protoplasts using PEG or electroporation, and the regeneration of maize plants from transformed protoplasts. Gordon-Kamm et al., 1990 and Fromm et al., 1990 have published techniques for transformation of an A188-derived maize line using particle bombardment. Furthermore, WO 93/07278 and Koziel et al., 1993 describe techniques for the transformation of elite inbred lines of maize by particle bombardment. This technique utilizes immature maize embryos of 1.5-2.5 mm length excised from a maize ear 14-15 days after pollination and a PDS-1000/He Biolistic particle delivery device (Bio-Rad Laboratories, Hercules, Calif., United States of America) for bombardment.

Transformation of rice can also be undertaken by direct gene transfer techniques utilizing protoplasts or particle bombardment. Protoplast-mediated transformation has been disclosed for Japonica-types and Indica-types (Zhang et al., 1988; Shimamoto et al., 1989; Datta et al., 1990) of rice. Both types are also routinely transformable using particle bombardment (Christou et al., 1991). Furthermore, WO 93/21335 describes techniques for the transformation of rice via electroporation. Casas et al., 1993 discloses the production of transgenic sorghum plants by microprojectile bombardment.

European Patent Application EP 0 332 581 describes techniques for the generation, transformation, and regeneration of Pooideae protoplasts. These techniques allow the transformation of Dactylis and wheat. Furthermore, wheat transformation has been disclosed in Vasil et al., 1992 using particle bombardment into cells of type C long-term regenerable callus, and also by Vasil et al., 1993 and Weeks et al., 1993 using particle bombardment of immature embryos and immature embryo-derived callus.

A representative technique for wheat transformation, however, involves the transformation of wheat by particle bombardment of immature embryos and includes either a high sucrose or a high maltose step prior to gene delivery. Prior to bombardment, embryos (0.75-1 mm in length) are plated onto MS medium with 3% sucrose (Murashige & Skoog, 1962) and 3 mg/l 2,4-dichlorophenoxyacetic acid (2,4-D) for induction of somatic embryos, which is allowed to proceed in the dark. On the chosen day of bombardment, embryos are removed from the induction medium and placed onto the osmoticum (i.e. induction medium with sucrose or maltose added at the desired concentration, typically 15%). The embryos are allowed to plasmolyze for 2-3 hours and are then bombarded. Twenty embryos per target plate are typical, although not critical. An appropriate gene-carrying plasmid (such as pCIB3064 or pSOG35) is precipitated onto micrometer size gold particles using standard procedures. Each plate of embryos is shot with a biolistics device using a burst pressure of about 1000 pounds per square inch (psi) and a standard 80 mesh screen. After bombardment, the embryos are placed back into the dark to recover for about 24 hours (still on osmoticum). After 24 hours, the embryos are removed from the osmoticum and placed back onto induction medium where they stay for about a month before regeneration. Approximately one month later the embryo explants with developing embryogenic callus are transferred to regeneration medium (MS+1 mg/liter NAA, 5 mg/liter GA), further containing the appropriate selection agent (10 mg/l BASTA® in the case of pCIB3064 and 2 mg/l methotrexate in the case of pSOG35). After approximately one month, developed shoots are transferred to larger sterile containers known as “GA7s” which contain half-strength MS, 2% sucrose, and the same concentration of selection agent.

Transformation of monocotyledons using Agrobacterium has also been disclosed. See WO 94/00977 and U.S. Pat. No. 5,591,616, both of which are incorporated herein by reference. See also Negrotto et al., 2000, incorporated herein by reference. Zhao et al., 2000 specifically discloses transformation of sorghum with Agrobacterium. See also U.S. Pat. No. 6,369,298.

Rice (Oryza sativa) can be used for generating transgenic plants. Various rice cultivars can be used (Hiei et al., 1994; Dong et al., 1996; Hiei et al., 1997). Also, the various media constituents disclosed below can be either varied in quantity or substituted. Embryogenic responses are initiated and/or cultures are established from mature embryos by culturing on MS-CIM medium (MS basal salts, 4.3 g/liter; B5 vitamins (200×), 5 ml/liter; sucrose, 30 g/liter; proline, 500 mg/liter; glutamine, 500 mg/liter; casein hydrolysate, 300 mg/liter; 2,4-D (1 mg/ml), 2 ml/liter; pH adjusted to 5.8 with 1 N KOH; Phytagel, 3 g/liter). Either mature embryos at the initial stages of culture response or established culture lines are inoculated and co-cultivated with the Agrobacterium tumefaciens strain LBA4404 (Agrobacterium) containing the desired vector construction. Agrobacterium is cultured from glycerol stocks on solid YPC medium (plus 100 mg/L spectinomycin and any other appropriate antibiotic) for about 2 days at 28° C. Agrobacterium is re-suspended in liquid MS-CIM medium. The Agrobacterium culture is diluted to an OD₆₀₀ of 0.2-0.3 and acetosyringone is added to a final concentration of 200 μM. Acetosyringone is added before mixing the solution with the rice cultures to induce Agrobacterium for DNA transfer to the plant cells. For inoculation, the plant cultures are immersed in the bacterial suspension. The liquid bacterial suspension is removed and the inoculated cultures are placed on co-cultivation medium and incubated at 22° C. for two days. The cultures are then transferred to MS-CIM medium with ticarcillin (400 mg/liter) to inhibit the growth of Agrobacterium. For constructs utilizing the PMI selectable marker gene (Reed et al., 2001), cultures are transferred to selection medium containing mannose as a carbohydrate source (MS with 2% mannose, 300 mg/liter ticarcillin) after 7 days, and cultured for 3-4 weeks in the dark. Resistant colonies are then transferred to regeneration induction medium (MS with no 2,4-D, 0.5 mg/liter IAA, 1 mg/liter zeatin, 200 mg/liter TIMENTIN®, 2% mannose, and 3% sorbitol) and grown in the dark for 14 days. Proliferating colonies are then transferred to another round of regeneration induction media and moved to the light growth room. Regenerated shoots are transferred to GA7 containers with GA7-1 medium (MS with no hormones and 2% sorbitol) for 2 weeks and then moved to the greenhouse when they are large enough and have adequate roots. Plants are transplanted to soil in the greenhouse (T₀ generation) and grown to maturity, and the T₁ seed is harvested.
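The dilution steps in the rice protocol above (adjusting the Agrobacterium suspension to an OD₆₀₀ of 0.2-0.3 and bringing acetosyringone to 200 μM) follow the usual C1·V1 = C2·V2 relationship. The following Python sketch illustrates the arithmetic only; the starting OD₆₀₀ of 1.0, the 20 ml working volume, and the 100 mM acetosyringone stock are hypothetical values not taken from the disclosure, and OD₆₀₀ is assumed to scale linearly with dilution over this range.

```python
def dilution_volumes(stock_value, target_value, final_volume):
    """Return (stock_volume, diluent_volume) using C1*V1 = C2*V2.

    stock_value and target_value must share a unit (OD600, mM, ...);
    the returned volumes are in the unit of final_volume.
    """
    if stock_value <= 0 or target_value <= 0 or final_volume <= 0:
        raise ValueError("all inputs must be positive")
    if target_value > stock_value:
        raise ValueError("target exceeds stock; dilution cannot concentrate")
    stock_volume = target_value * final_volume / stock_value
    return stock_volume, final_volume - stock_volume


# Hypothetical: suspension read at OD600 = 1.0; 20 ml wanted at OD600 = 0.25
# (the midpoint of the 0.2-0.3 range given in the text).
culture_ml, medium_ml = dilution_volumes(1.0, 0.25, 20.0)

# Hypothetical: 100 mM acetosyringone stock, 200 uM (0.2 mM) final in 20 ml.
acetosyringone_ml, _ = dilution_volumes(100.0, 0.2, 20.0)

print(f"culture: {culture_ml:.2f} ml, MS-CIM medium: {medium_ml:.2f} ml")
print(f"acetosyringone stock: {acetosyringone_ml * 1000:.0f} microliters")
```

The small acetosyringone volume is ignored when computing the medium volume, which is a simplification that is usually acceptable at this scale.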

IV.F.3. Transformation of Plastids

Seeds of Nicotiana tabacum c.v. ‘Xanthi nc’ are germinated seven per plate in a 1″ circular array on T agar medium and bombarded 12-14 days after sowing with 1 μm tungsten particles (M10, Bio-Rad Laboratories, Hercules, Calif., United States of America) coated with DNA from plasmids pPH143 and pPH145 essentially as disclosed (Svab & Maliga, 1993). Bombarded seedlings are incubated on T medium for two days after which leaves are excised and placed abaxial side up in bright light (350-500 μmol photons/m²/s) on plates of RMOP medium (Svab et al., 1990) containing 500 μg/ml spectinomycin dihydrochloride (Sigma, St. Louis, Mo., United States of America). Resistant shoots appearing underneath the bleached leaves three to eight weeks after bombardment are subcloned onto the same selective medium and allowed to form callus, and secondary shoots are isolated and subcloned. Complete segregation of transformed plastid genome copies (homoplasmicity) in independent subclones is assessed by standard techniques of Southern blotting (Sambrook & Russell, 2001). BamH I/EcoR I-digested total cellular DNA (Mettler, 1987) is separated on 1% Tris-borate-EDTA (TBE) agarose gels, transferred to nylon membranes (Amersham Biosciences, Piscataway, N.J., United States of America) and probed with ³²P-labeled random primed DNA sequences corresponding to a 0.7 kb BamH I/Hind III DNA fragment from pC8 containing a portion of the rps7/12 plastid targeting sequence. Homoplasmic shoots are rooted aseptically on spectinomycin-containing MS/IBA medium (McBride et al., 1994) and transferred to the greenhouse.

V. Plants, Breeding, and Seed Production

V.A. Plants

The presently disclosed subject matter also provides plants comprising the disclosed compositions. In some embodiments, the modification includes being enriched for an essential amino acid as a proportion of a polypeptide fraction of the plant. In some embodiments, the polypeptide fraction can be, for example, total seed polypeptide, soluble polypeptide, insoluble polypeptide, water-extractable polypeptide, and lipid-associated polypeptide. In some embodiments, the modification includes overexpression, underexpression, antisense modulation, sense suppression, inducible expression, inducible repression, or inducible modulation of a gene.

V.B. Breeding

The plants obtained via transformation with a nucleotide sequence of the presently disclosed subject matter can be any of a wide variety of plant species, including monocots and dicots; however, the plants used in the method for the presently disclosed subject matter are selected in some embodiments from the list of agronomically important target crops set forth hereinabove. The expression of a gene of the presently disclosed subject matter in combination with other characteristics important for production and quality can be incorporated into plant lines through breeding. Breeding approaches and techniques are known in the art. See, e.g., Welsh, 1981; Wood, 1983; Mayo, 1987; Singh, 1986; Wricke & Weber, 1986.

The genetic properties engineered into the transgenic seeds and plants disclosed above are passed on by sexual reproduction or vegetative growth and can thus be maintained and propagated in progeny plants. Generally, the maintenance and propagation make use of known agricultural methods developed to fit specific purposes such as tilling, sowing, or harvesting. Specialized processes such as hydroponics or greenhouse technologies can also be applied. As the growing crop is vulnerable to attack and damage caused by insects or infections as well as to competition by weed plants, measures are undertaken to control weeds, plant diseases, insects, nematodes, and other adverse conditions to improve yield. These include mechanical measures such as tillage of the soil or removal of weeds and infected plants, as well as the application of agrochemicals such as herbicides, fungicides, gametocides, nematicides, growth regulants, ripening agents, and insecticides.

Depending on the desired properties, different breeding measures are taken. The relevant techniques are well known in the art and include, but are not limited to, hybridization, inbreeding, backcross breeding, multiline breeding, variety blend, interspecific hybridization, aneuploid techniques, etc. Hybridization techniques can also include the sterilization of plants to yield male or female sterile plants by mechanical, chemical, or biochemical means. Cross-pollination of a male sterile plant with pollen of a different line assures that the genome of the male sterile but female fertile plant will uniformly obtain properties of both parental lines. Thus, the transgenic seeds and plants according to the presently disclosed subject matter can be used for the breeding of improved plant lines that, for example, increase the effectiveness of conventional methods such as herbicide or pesticide treatment or allow one to dispense with said methods due to their modified genetic properties. Alternatively, new crops with improved stress tolerance can be obtained, which, due to their optimized genetic “equipment”, yield harvested product of better quality than products that were not able to tolerate comparable adverse developmental conditions (for example, drought).

V.C. Seed Production

Some embodiments of the presently disclosed subject matter also provide seed and isolated product from plants that comprise an expression cassette comprising a promoter sequence operably linked to an isolated nucleic acid, the nucleotide sequence being selected from the group consisting of:

    (a) a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence as set forth in SEQ ID NO: 1, or a fragment, domain, or feature thereof;
    (b) a nucleotide sequence encoding a polypeptide that is an ortholog of a polypeptide of SEQ ID NO: 2, or a fragment, domain, or feature thereof;
    (c) a nucleotide sequence complementary (for example, fully complementary) to (a) or (b); and
    (d) a nucleotide sequence that is the reverse complement (for example, its full reverse complement) of (a) or (b) according to the present disclosure.
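Items (c) and (d) above distinguish a complementary sequence from a reverse complement. As an illustration only, the Python sketch below computes both for a short DNA string; the nine-base example is a hypothetical placeholder and is not drawn from SEQ ID NO: 1.

```python
# Translation table for the four unambiguous DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Return the base-by-base complement, in the same left-to-right order."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Return the complement read in the opposite direction (5' to 3')."""
    return complement(seq)[::-1]


# Hypothetical placeholder sequence.
example = "ATGGCTACC"
print(complement(example))          # TACCGATGG
print(reverse_complement(example))  # GGTAGCCAT
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when handling real sequence records.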

In some embodiments the isolated product comprises an enzyme, a nutritional polypeptide, a structural polypeptide, an amino acid, a lipid, a fatty acid, a polysaccharide, a sugar, an alcohol, an alkaloid, a carotenoid, a propanoid, a steroid, a pigment, a vitamin, or a plant hormone.

Embodiments of the presently disclosed subject matter also relate to isolated products produced by expression of an isolated nucleic acid containing a nucleotide sequence selected from the group consisting of:

    (a) a nucleotide sequence that hybridizes under highly stringent conditions of hybridization of 65° C. in 6×SSC, followed by a final washing step of at least 15 minutes at 65° C. in 0.1×SSC to a nucleotide sequence as set forth in SEQ ID NO: 1, or a fragment, domain, or feature thereof;
    (b) a nucleotide sequence encoding a polypeptide that is an ortholog of a polypeptide listed in SEQ ID NO: 2, or a fragment, domain, or feature thereof;
    (c) a nucleotide sequence complementary (for example, fully complementary) to (a) or (b); and
    (d) a nucleotide sequence that is the reverse complement (for example, its full reverse complement) of (a) or (b) according to the present disclosure.
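The stringency conditions in item (a) are stated as multiples of SSC. For orientation, the Python sketch below converts a fold-strength into solute concentrations, taking 1×SSC as 150 mM sodium chloride and 15 mM sodium citrate (the conventional definition). The 20× stock and the 1 liter batch size in the usage lines are assumptions for illustration and are not specified in the disclosure.

```python
# Conventional composition of 1x SSC, in millimolar.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold):
    """Return solute concentrations (mM) for a given fold-strength of SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {solute: round(conc * fold, 6) for solute, conc in SSC_1X_MM.items()}


def stock_volume_ml(fold, final_volume_ml, stock_fold=20.0):
    """Volume of concentrated stock needed for the requested batch."""
    if fold > stock_fold:
        raise ValueError("requested strength exceeds the stock strength")
    return final_volume_ml * fold / stock_fold


hybridization = ssc_composition(6)   # 6x SSC used for hybridization
final_wash = ssc_composition(0.1)    # 0.1x SSC used for the final wash

print(hybridization, stock_volume_ml(6, 1000.0))
print(final_wash, stock_volume_ml(0.1, 1000.0))
```

The sixty-fold drop in salt between hybridization and the final wash is what makes the wash step the more stringent of the two at the same temperature.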

In some embodiments, the product is produced in a plant. In some embodiments, the product is produced in cell culture. In some embodiments, the product is produced in a cell-free system. In some embodiments, the product comprises an enzyme, a nutritional polypeptide, a structural polypeptide, an amino acid, a lipid, a fatty acid, a polysaccharide, a sugar, an alcohol, an alkaloid, a carotenoid, a propanoid, a steroid, a pigment, a vitamin, or a plant hormone. In some embodiments, the product is a polypeptide comprising an amino acid sequence listed in SEQ ID NO: 2, or ortholog thereof. In some embodiments, the polypeptide comprises an enzyme.

In seed production, germination quality and uniformity of seeds are essential product characteristics. As it is difficult to keep a crop free from other crop and weed seeds, to control seedborne diseases, and to produce seed with good germination, fairly extensive and well-defined seed production practices have been developed by seed producers who are experienced in the art of growing, conditioning, and marketing of pure seed. Thus, it is common practice for the farmer to buy certified seed meeting specific quality standards instead of using seed harvested from his own crop. Propagation material to be used as seeds is customarily treated with a protectant coating comprising herbicides, insecticides, fungicides, bactericides, nematicides, molluscicides, or mixtures thereof. Customarily used protectant coatings comprise compounds such as captan, carboxin, thiram (tetramethylthiuram disulfide; TMTD®; available from R. T. Vanderbilt Company, Inc., Norwalk, Conn., United States of America), metalaxyl (APRON XL®; available from Syngenta Corp., Wilmington, Del., United States of America), and pirimiphos-methyl (ACTELLIC®; available from Agriliance, LLC, St. Paul, Minn., United States of America). If desired, these compounds are formulated together with further carriers, surfactants, and/or application-promoting adjuvants customarily employed in the art of formulation to provide protection against damage caused by bacterial, fungal, or animal pests. The protectant coatings can be applied by impregnating propagation material with a liquid formulation or by coating with a combined wet or dry formulation. Other methods of application are also possible such as treatment directed at the buds or the fruit.

VI. Additional Applications

The presently disclosed subject matter also provides methods for targeting a protein of interest to a structure of a plant cell. In some embodiments, the structure is selected from the group including but not limited to endoplasmic reticulum (ER) and apoplast. In some embodiments, the method comprises (a) fusing a nucleic acid molecule encoding a signal sequence of a Zea mays Q protein in frame to a nucleotide sequence encoding the protein of interest, wherein the nucleic acid molecule encoding a signal sequence of a Zea mays Q protein and the nucleotide sequence encoding the protein of interest are operably linked to a promoter to produce a plant expression construct; and (b) transforming the plant cell with the plant expression construct, whereby the protein of interest is targeted to the structure. In some embodiments, the signal sequence corresponds to the first 19 amino acids of SEQ ID NO: 2.
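Fusing a signal sequence "in frame" means that the codon phase set by the signal sequence's start codon carries through the junction into the coding sequence of the protein of interest. The Python sketch below shows one way such a check might be automated. Both sequences are hypothetical placeholders (the 57-nucleotide signal stands in for a 19-amino-acid signal sequence and is not derived from SEQ ID NO: 2), and the function name `fuse_in_frame` is introduced here for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive triplets starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(signal_cds, protein_cds):
    """Concatenate two coding sequences after checking the reading frame."""
    signal_cds, protein_cds = signal_cds.upper(), protein_cds.upper()
    if len(signal_cds) % 3 or len(protein_cds) % 3:
        raise ValueError("each segment must be a whole number of codons")
    if not signal_cds.startswith("ATG"):
        raise ValueError("signal sequence must begin with a start codon")
    fused = signal_cds + protein_cds
    # A stop codon anywhere before the last codon would truncate the fusion.
    if any(c in STOP_CODONS for c in codons(fused)[:-1]):
        raise ValueError("premature stop codon in the fused reading frame")
    return fused


# Hypothetical placeholders: 19 codons of signal, 5 codons of payload.
SIGNAL_CDS = "ATG" + "GCT" * 18
PROTEIN_CDS = "GGTTCTAAAGAATAA"

fused = fuse_in_frame(SIGNAL_CDS, PROTEIN_CDS)
print(len(SIGNAL_CDS) // 3, "signal codons;", len(fused) // 3, "codons in fusion")
```

A real construct would also need the promoter, terminator, and any linker or cleavage site, none of which are modeled here.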

As used herein, the term “protein of interest” refers to any polypeptide or polypeptide fragment for which the expression in a plant or plant cell would be desirable. Exemplary proteins of interest include, but are not limited to, carbohydrases, cellulases, hemicellulases, pectinases, isomerases, lyases, proteases, heat shock proteins, chaperonins, phytases, insecticidal proteins, antimicrobial proteins, α-amylases, glucoamylases, glucanases, glucosidases, xylanases, ferulic acid esterases, galactosidases, and chymosins. In some embodiments, the protein of interest is naturally occurring in the plant cell, but one or more extra copies of a nucleic acid sequence encoding the polypeptide of interest is provided as a transgene. In some embodiments, the protein of interest is heterologous to the plant or plant cell.

The presently disclosed subject matter also provides methods for producing a plant seed with an increased nutritional value. In some embodiments, the method comprises (a) transforming a plant cell with an expression vector comprising a nucleotide sequence encoding SEQ ID NO: 2, or a fragment or derivative thereof; (b) regenerating a plant from the transformed plant cell; and (c) isolating a seed from the regenerated plant, whereby a seed with an increased nutritional value is produced.

As used herein, the phrase “a seed with increased nutritional value” refers to a seed that has been modified so as to contain a polypeptide that is not normally found in the seed or to contain an elevated amount of a polypeptide that is normally found in the seed. In some embodiments, the polypeptide that is not normally found in the seed is a derivative of a protein that the seed normally contains. Such a derivative can be characterized, for example, by a modification of the amino acid sequence of a naturally occurring protein to include a higher content of one or more amino acids (for example, an increased proportion of one or more essential amino acids), an improved amino acid balance, and/or an improved amino acid digestibility when compared to a seed from a non-transformed plant of the same species. These modifications of the amino acid sequence can be accomplished by mutagenesis of a nucleotide sequence encoding the polypeptide using techniques that are known to the skilled artisan.

The presently disclosed subject matter also provides methods for targeting a protein of interest to a protein body in a plant. In some embodiments, the method comprises (a) fusing a nucleic acid molecule encoding SEQ ID NO: 2, or a fragment or derivative thereof, in frame to a nucleotide sequence encoding the protein of interest, wherein the nucleic acid molecule encoding SEQ ID NO: 2, or the fragment or derivative thereof, and the nucleotide sequence encoding the protein of interest are operably linked to a promoter to produce a plant expression construct; and (b) transforming the plant cell with the plant expression construct, whereby the protein of interest is targeted to a protein body in the plant. In some embodiments, the nucleic acid molecule encoding SEQ ID NO: 2, or a fragment or derivative thereof, which is fused in frame to a nucleotide sequence encoding the protein of interest includes a nucleotide sequence encoding a protease cleavage site between the nucleic acid molecule encoding SEQ ID NO: 2, or a fragment or derivative thereof, and the nucleotide sequence encoding the protein of interest.

Proteases for which a cleavage site can be introduced in the fusion protein, and the amino acid sequences of recognized cleavage sites, are known in the arts of molecular biology and protein expression, and include, but are not limited to, Factor Xa, thrombin, the family of caspases, chymotrypsin, pepsin, tobacco etch virus protease, and trypsin. See Sambrook & Russell, 2001, and references cited therein, for a discussion of the use of proteases to cleave fusion proteins. The protease can be chosen based upon its recognition sequence, for example, based on the absence of the recognition sequence in the protein of interest, such that when the fusion protein is treated with the protease, the protein of interest will be released intact from the Q protein (or fragment or derivative thereof). Additional purification steps that are known to the skilled artisan can then be used to purify the protein of interest. These techniques include affinity purification (if an antibody to the protein of interest is available), SDS-PAGE separation, size based chromatography, or any other method available to the skilled artisan.
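The selection criterion described above, choosing a protease whose recognition sequence is absent from the protein of interest, can be screened computationally. The Python sketch below does so for three of the listed proteases using simplified, commonly cited consensus motifs (Factor Xa: I-E/D-G-R; thrombin: L-V-P-R-G-S; tobacco etch virus protease: E-N-L-Y-F-Q-G/S). These motifs are approximations supplied here as assumptions, real protease specificity is broader than a single pattern, and the example protein is a hypothetical placeholder.

```python
import re

# Simplified consensus motifs (one-letter amino acid codes); assumptions
# for illustration, not an exhaustive description of protease specificity.
RECOGNITION_MOTIFS = {
    "Factor Xa": "I[ED]GR",
    "thrombin": "LVPRGS",
    "tobacco etch virus protease": "ENLYFQ[GS]",
}


def internal_sites(protein):
    """Map each protease to the 0-based positions of its motif in protein."""
    seq = protein.upper()
    return {
        name: [m.start() for m in re.finditer("(?=" + motif + ")", seq)]
        for name, motif in RECOGNITION_MOTIFS.items()
    }


def candidate_proteases(protein):
    """Proteases whose motif does not occur inside the protein of interest."""
    return [name for name, hits in internal_sites(protein).items() if not hits]


# Hypothetical protein of interest containing Factor Xa and TEV motifs.
EXAMPLE_PROTEIN = "MKTAIEGRLLNDSENLYFQGAA"
print(internal_sites(EXAMPLE_PROTEIN))
print(candidate_proteases(EXAMPLE_PROTEIN))  # ['thrombin']
```

Low-specificity proteases from the list, such as trypsin, chymotrypsin, and pepsin, are left out of the table because they would be expected to cut within almost any protein of interest.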

EXAMPLES

The following Examples have been included to illustrate representative and exemplary modes of the presently disclosed subject matter. In light of the present disclosure and the general level of skill in the art, those of skill will appreciate that the following Examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the spirit and scope of the presently disclosed subject matter.

Standard recombinant DNA and molecular cloning techniques used here arewell known in the art and are disclosed in Sambrook & Russell, 2001;Silhavy et al., 1984; Ausubel et al., 2002; Ausubel et al., 2003; Reiteret al., 1992; and Schultz et al., 1998.

Example 1 Isolation of Promoter/Genomic Clones Corresponding to the Q Protein Gene

The nucleotide sequence from the Q protein ORF (SEQ ID NO: 1) was used to design primers for genomic walking. A genomic clone corresponding to the Q protein cDNA was isolated via PCR utilizing maize libraries based on the GENOME WALKER™ kit (BD Biosciences Clontech, Palo Alto, Calif., United States of America) as template. GENOME WALKER™ adaptor-ligated maize genomic libraries were constructed using DNA from maize cultivar 6N615. Five different libraries were constructed, each comprising genomic DNA fragments generated by digestion with one of the following blunt end restriction enzymes: Dra I, EcoR V, Hinc II, Ssp I, or Stu I. GENOME WALKER™ libraries were screened utilizing one or more of the gene specific primers QP1 (SEQ ID NO: 4) and QP2 (SEQ ID NO: 5) with the adaptor primer supplied in the kit. PCR conditions were those suggested in BD Biosciences Clontech's user's manual. PCR products were fractionated on agarose gels, and the resulting product band was excised, subcloned into a TOPO® vector (Invitrogen Corp., Carlsbad, Calif., United States of America) and sequenced to verify the correct product. A 516 base pair genomic clone corresponding to the Q protein gene was cloned from the Dra I GENOME WALKER™ library.

Example 2 Activity of the Q Promoter

Transgenic maize plants were stably transformed with binary vector 11037 using Agrobacterium-mediated transformation. Vector 11037 is identical to pNOV4061 (SEQ ID NO: 14; see also PCT International Patent Application Publication WO 03/57248) with the exception that the γ-zein promoter found in the phytase expression cassette in pNOV4061 is replaced with the Q promoter (SEQ ID NO: 6) in vector 11037. The gene product of the phytase expression cassette in vector 11037 is presented in SEQ ID NO: 12, and is identical to SEQ ID NO: 6 of PCT International Patent Application Publication WO 03/57248. It includes the 19 amino acid signal sequence from the 27 kDa γ-zein (corresponds to the first 19 amino acids of SEQ ID NO: 12), Nov9x phytase, and the hexapeptide SEKDEL (SEQ ID NO: 11).

T1 seed were harvested from plants regenerated from maize tissue transformed with vector 11037. Seed were pulverized and soluble proteins were extracted from flour samples using extraction buffer (50 mM Tris-HCl, pH 8.0, 100 mM NaCl, 2 mM EDTA). Flour suspensions were incubated at room temperature for 60 minutes with agitation, and insoluble material was removed by centrifugation.

The measurement of phytase activity and detection of Nov9x phytase by Western blot analysis were performed as described in Example 3 of PCT WO 03/57248. All reagents and reaction conditions were as described in WO 03/57248, and all reagent volumes were adjusted proportionately. Briefly, the measurement of phytase activity is based on the detection of inorganic phosphate released from sodium phytate substrate by the hydrolytic action of phytase. Phytase assay procedures were adapted from the protocols of Wyss et al., 1999 and Engelen et al., 2001.

Phytase activity in flour samples from single kernels and pooled kernels was measured at pH 5.5 and 37° C. Assay results for eight samples of T1 seed harvested from two regenerated transgenic plants (plants A and B) are presented in Table 1. Sample 305B-11A represents the phytase activity extracted from a control corn flour sample containing Nov9x phytase. The samples are listed in order of decreasing phytase activity.

TABLE 1
Demonstration of Phytase Activity in Transgenic Corn Flour

Sample^(a)                      Phytase Units per Gram of Corn Flour
305B-11A (positive control)     1775
Plant A, seed #3                451
Plant A, seed 10                392
Plant B, seed 5                 314
Plant A, seed 4                 292
Plant B (pooled seed)           291
Plant B, seed 4                 255
Plant B, seed 7                 204
Plant A, seed 5                 191

^(a)Each sample was from a numbered, single seed, with the exception of the sample designated as a pooled seed, which corresponds to 10 T1 kernels pulverized together.

Flour extracts analyzed in Table 1 were also analyzed by Western blot analysis using antisera to purified Nov9x phytase as shown in FIG. 1. The first lane on the blot depicted in FIG. 1 indicates the positions of molecular weight standards (labeled on left). A188 is an extract of seed from non-transgenic corn. Nov9x phytase is identified as the prominent band migrating in the region of the 55 kDa standard. Nov9x phytase was not detected in the A188 corn flour extract.

Example 3 SDS-PAGE Analysis of Proteins Extracted from Seeds

Dried kernels from Hi-II×A188 maize were soaked in distilled water at 4° C. overnight. Endosperm and embryos were separated and frozen at −80° C. Frozen tissue was pulverized using a KLECO tissue pulverizer (KLECO, Visalia, Calif., United States of America). Proteins were extracted from endosperm flour by addition of 500 μl extraction buffer (50 mM HEPES, pH 8, 2 mM EDTA, 100 mM NaCl, and 4 mM DTT) for every 100 mg flour. Samples were vortexed and rocked at room temperature for 40 minutes. Proteins were extracted from the embryo paste by addition of 2.5 ml extraction buffer for every 100 mg tissue. A portion of the soluble fraction was heated at 80° C. for 20 minutes, and aggregated proteins were removed by centrifugation. Samples of the extract before (−) and after (+) heating were analyzed by SDS-PAGE (see FIG. 2).

Discussion of Examples 1-3

A 55 kDa protein is the second most abundant protein released from endosperm flour in buffer and DTT.

In the course of testing different conditions for extracting a recombinant thermostable enzyme from maize flour, an abundant endosperm protein of 55 kDa was identified that remained soluble during heating at 80° C. (Example 3, FIG. 2). The 55 kDa protein was the second most abundant protein released from endosperm flour in buffer containing 100 mM NaCl and 4 mM DTT. The most abundant protein in these extracts was the 27 kDa γ-zein, a maize prolamin that typically constitutes 15% of total endosperm protein.

A similar endosperm fraction was described by Vitale et al., 1982. They incubated protein bodies in the presence of DTT or β-ME and characterized proteins in the soluble fraction. The two most abundant proteins released from protein bodies under these conditions had molecular weights of 28,000 and 58,000. Proteins in this fraction were named “Reduced Soluble Proteins” or RSPs.

An EST in GENBANK® matches the first 20 codons of the 55 kDa protein and encodes a polypeptide chain of 171 residues.

The amino acid sequences at the amino-termini of the 27 kDa and 55 kDa proteins discussed above were determined. The partial sequence of the 27 kDa protein was THTSGGXXXQPPPPVHHPP (SEQ ID NO: 7). This sequence matches the amino-terminus of the 28 kDa maize glutelin-2 (γ-zein; Prat et al., 1985; Boronat et al., 1986).

The partial sequence of the 55 kDa protein was TQTGGCSCGQQQSHEQQHHP (SEQ ID NO: 8). A search of the PIR and SwissProt databases did not identify any matches to this sequence. However, when a back-translated DNA sequence was used to screen GENBANK®, the query sequence matched at each codon with an expressed sequence tag (EST) from an endosperm cDNA library (GENBANK® Accession No. AI812147). Translation of the EST sequence starting with the codon that corresponds to the amino-terminus of the 55 kDa protein yields a deduced polypeptide chain of 171 residues.
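As an illustration of the back-translation step, the following sketch converts the 20-residue amino-terminal sequence (SEQ ID NO: 8) into a degenerate DNA query using IUPAC ambiguity codes for the standard genetic code. The disclosure does not state how the back-translated query was actually prepared, so this is only one possible approach; the names `DEGENERATE_CODON` and `back_translate` are hypothetical.

```python
# One degenerate codon per amino acid, using IUPAC nucleotide ambiguity
# codes (N = any base, R = A/G, Y = C/T, H = A/C/T, M = A/C, W = A/T,
# S = C/G). The codes for Arg, Leu, and Ser are deliberately
# over-inclusive so that a single codon covers both codon families.
DEGENERATE_CODON = {
    "A": "GCN", "R": "MGN", "N": "AAY", "D": "GAY", "C": "TGY",
    "Q": "CAR", "E": "GAR", "G": "GGN", "H": "CAY", "I": "ATH",
    "L": "YTN", "K": "AAR", "M": "ATG", "F": "TTY", "P": "CCN",
    "S": "WSN", "T": "ACN", "W": "TGG", "Y": "TAY", "V": "GTN",
}


def back_translate(peptide):
    """Return a degenerate DNA sequence that could encode the peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())


N_TERMINUS_55KDA = "TQTGGCSCGQQQSHEQQHHP"  # SEQ ID NO: 8
query = back_translate(N_TERMINUS_55KDA)
print(len(query), query)
```

The 20 residues give a 60-nucleotide query, which corresponds to the first 20 codons compared against the EST.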

Cloning of two overlapping cDNAs that encode a protein of 289 residues (molecular weight 34,000).

Oligonucleotide primers against the EST sequence were used to amplify a cDNA clone from a maize cDNA library. The predicted stop codon in the EST sequence was found to encode a Leu in the new clone, and the ORF continued to the end of the cloned sequence. A second cDNA clone that overlapped with the 3′ end of the first was amplified from the same library. The new ORF encoded an additional 118 amino acids, for a total of 289. The calculated molecular weight of the deduced protein is 34,000. The discrepancy between the calculated molecular weight (34,000) and that estimated from SDS-PAGE analysis (55,000) could be due to unusual structural features related to the abundance of Gln residues. The amino acid composition of the deduced protein is shown in Table 2.

TABLE 2
Amino Acid Composition of Zeins

Amino   zein-   zein-   zein-   zein-   zein-   zein-   SEQ ID
Acid    α19     α22     β15     δ10     δ18     γ27     NO: 2
A       29      34      23      7       9       10      6
C       2       1       7       5       3       15      16
D       0       0       1       1       0       0       3
E       1       2       3       0       0       2       13
F       13      8       0       5       5       2       3
G       5       2       14      4       5       13      11
H       2       3       0       3       3       16      29
I       9       11      1       3       8       4       8
K       0       0       0       0       1       0       5
L       43      42      16      15      13      19      11
M       0       5       18      29      53      1       4
N       10      13      3       3       3       0       8
P       23      22      14      20      35      51      16
Q       41      51      26      15      15      30      104
R       2       4       5       0       0       5       4
S       15      17      8       8       18      8       23
T       5       7       5       5       10      9       6
V       5       17      3       5       5       15      9
W       0       0       0       0       2       0       0
Y       8       7       14      1       2       4       10
Total   213     246     161     129     190     204     289
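The calculated molecular weight can be approximated from the amino acid composition alone. The sketch below sums average residue masses over the SEQ ID NO: 2 counts given in Table 2 and adds one water for the chain termini. The residue masses are standard approximate values that are not part of this disclosure, and the function name is hypothetical.

```python
# Average residue masses in daltons (approximate standard values: free
# amino acid mass minus the water lost on peptide bond formation).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015

# Residue counts for SEQ ID NO: 2, copied from Table 2.
SEQ2_COUNTS = {
    "A": 6, "C": 16, "D": 3, "E": 13, "F": 3, "G": 11, "H": 29,
    "I": 8, "K": 5, "L": 11, "M": 4, "N": 8, "P": 16, "Q": 104,
    "R": 4, "S": 23, "T": 6, "V": 9, "W": 0, "Y": 10,
}


def molecular_weight(counts):
    """Sum residue masses and add one water for the free chain termini."""
    return sum(RESIDUE_MASS[aa] * n for aa, n in counts.items()) + WATER


mw = molecular_weight(SEQ2_COUNTS)
print(f"{sum(SEQ2_COUNTS.values())} residues, about {mw:,.0f} Da")
```

Under these assumptions the result is close to the calculated value of 34,000 stated above; it does not account for disulfide bonds or any post-translational modification.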

The amino acid sequence at the C-terminus shows homology to prolamins, the seed storage proteins of cereals.

The amino acid sequence from residues 172-289 of the mature protein shows homology to prolamins of several cereals including maize, rice, and barley. Sequence identities range from 36-53% for stretches of 58-118 residues: 37% (44/116 residues) for a rice prolamin, 38% (46/118) for a maize zein, 36% (40/115) for the sorghum prolamin γ-kafirin, 53% (31/58) for oat avenin, and 55% (33/59) for barley B-hordein. Identity with 27 kDa γ-zein (glutelin-2) was 38% over a stretch of 119 residues (46 matches). The homologous region was limited to the C-terminal domain of γ-zein.

The amino acid compositions of the major zeins are included in Table 2 for comparison with the 55 kDa protein. The deduced sequence is remarkable by comparison to the other zeins with respect to relative levels of three amino acids: the deduced sequence contains 104 Gln, 13 Glu, and 5 Lys out of 289 total residues. All other zeins have 0 Lys, except for the 18 kDa δ-zein, which has just 1.

The percent composition of each amino acid in the deduced protein and other zeins is shown in Table 3. The 55 kDa protein contains 36% Gln. By comparison, the α-zeins are the next most Gln-rich prolamins and contain only 20% Gln. The 55 kDa protein also has the lowest Pro content of maize prolamins. The amino acid composition of the 58 kDa RSP characterized by Vitale et al., 1982 is displayed side-by-side with that for the 55 kDa protein.

TABLE 3
Percent Composition of Amino Acids in Zeins

Amino   zein-   zein-   zein-   zein-   zein-   zein-   SEQ ID   58 kD
Acid    α19     α22     β15     δ10     δ18     γ27     NO: 2    RSP
G       2.3     0.8     8.7     3.1     2.6     6.4     3.8      4.9
A       13.6    13.8    14.3    5.4     4.7     4.9     2.1      3.5
V       2.3     6.9     1.9     3.9     2.6     7.4     3.1      5.1
L       20.2    17.1    9.9     11.6    6.8     9.3     3.8      4.7
I       4.2     4.5     0.6     2.3     4.2     2.0     2.8      2.2
M       0.0     2.0     11.2    22.5    17.9    0.5     1.4      1.3
F       6.1     3.3     0.0     3.9     2.6     1.0     1.0      1.9
W       0.0     0.0     0.0     0.0     1.1     0.0     0.0      0.0
K       0.0     0.0     0.0     0.0     0.5     0.0     1.7      2.6
P       10.8    8.9     8.7     15.5    18.4    25.0    5.5      4.7
C       0.9     0.4     4.3     3.9     1.6     7.4     5.5      4.4
S       7.0     6.9     5.0     6.2     9.5     3.9     8.0      7.3
T       2.3     2.8     3.1     3.9     5.3     4.4     2.1      2.5
Y       3.8     2.8     8.7     0.8     1.1     2.0     3.5      3.5
N       4.7     5.3     1.9     2.3     1.6     0.0     2.8      4.3
D       0.0     0.0     0.6     0.8     0.0     0.0     1.0
Q       19.2    20.7    16.1    11.6    7.9     14.7    36.0     33.6
E       0.5     0.8     1.9     0.0     0.0     1.0     4.5
H       0.9     1.2     0.0     2.3     1.6     7.8     10.0     11.0
R       0.9     1.6     3.1     0.0     0.0     2.5     1.4      2.1
SUM     100.0   100.0   100.0   100.0   100.0   100.0   100.0    99.6
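The percentages in Table 3 follow from the residue counts in Table 2 by dividing each count by the chain length. A minimal sketch for the SEQ ID NO: 2 column is given below; the counts are copied from Table 2, and the function name is hypothetical.

```python
# Residue counts for SEQ ID NO: 2, copied from Table 2.
SEQ2_COUNTS = {
    "A": 6, "C": 16, "D": 3, "E": 13, "F": 3, "G": 11, "H": 29,
    "I": 8, "K": 5, "L": 11, "M": 4, "N": 8, "P": 16, "Q": 104,
    "R": 4, "S": 23, "T": 6, "V": 9, "W": 0, "Y": 10,
}


def percent_composition(counts):
    """Convert residue counts to percentages, rounded to one decimal."""
    total = sum(counts.values())
    return {aa: round(100.0 * n / total, 1) for aa, n in counts.items()}


pct = percent_composition(SEQ2_COUNTS)
for aa in ("Q", "H", "P", "K"):
    print(aa, pct[aa])
```

For example, 104 Gln out of 289 residues corresponds to the 36.0% shown for SEQ ID NO: 2 in Table 3.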

Example 4 Accumulation of a Q Protein Fusion Protein in Transgenic Seed

Construct 11045 was a binary vector encoding a fusion protein of the full-length Q protein linked to an α-galactosidase enzyme from T. maritima with a tripeptide linker Gly-Gly-Ala. The gene encoding the fusion protein was operably linked to the promoter of the 27 kDa gamma-zein protein (SEQ ID NO: 13). Primers were designed to isolate a nucleotide sequence encoding the Q-protein from pNOV4349 (SEQ ID NO: 17, of which nucleotides 10061-10090 correspond to the Q-protein coding sequence) while adding a 9 bp sequence encoding the tripeptide linker to the 3′ end. 3′ Nco I and 5′ BamH I sites were added to the ends of the PCR product. The 953 bp PCR fragment was amplified from pNOV4349 and cut with BamH I and Nco I, then inserted into the 5227 bp BamH I/Nco I fragment from pNOV4325 (SEQ ID NO: 16; nucleotides 3565-5901 encode the α-galactosidase polypeptide). This plasmid was named pWIN010. Construct pWIN010 was then cut with Kpn I and Hind III, and the 3489 bp fragment was ligated to the 9154 bp Kpn I/Hind III fragment from binary vector pNOV2117 (SEQ ID NO: 15; see also PCT International Patent Application Publication WO 03/57248).

Transgenic maize plants were stably transformed with binary vector 11045 using Agrobacterium-mediated transformation. T1 seed were harvested from plants regenerated from maize tissue transformed with vector 11045. Seeds were pulverized and soluble proteins were extracted from flour samples using extraction buffer (50 mM Tris-HCl, pH 8.0, 100 mM NaCl, 2 mM EDTA). Flour suspensions were incubated at room temperature for 60 minutes with agitation, and insoluble material was removed by centrifugation. The corn flour samples were then analyzed for α-galactosidase activity. α-galactosidase activity results are presented in Table 4.

TABLE 4
α-galactosidase Activity in Transgenic Maize

Sample ID             Construct   α-galactosidase units per g of flour
MD9L011883            11045       4.06
MD9L011893            11045       9.31
MD9L011901            11045       8.39
MD9L011903            11045       9.28
MD9L011912            11045       7.79
Negative Corn Flour   (none)      −0.78

The measurement of α-galactosidase activity was based on the colorimetric assay involving the release of p-nitrophenol from p-nitrophenyl α-D-galactopyranoside as described in Liebl et al., 1998.

As shown in Table 4, flour from transgenic maize containing construct 11045, which encodes the Q-protein/α-galactosidase fusion protein, showed α-galactosidase activity above that of the negative corn flour control. The levels ranged from 4 α-galactosidase units/gram of flour to 9 α-galactosidase units/gram of flour. The negative corn flour control had no α-galactosidase activity. These results are also presented in FIG. 3.

Example 5 The Q Protein Signal Sequence Directs Accumulation of a Fusion Protein to Transgenic Seed

Binary vector 12173 was made by designing primers to isolate the Q protein signal sequence (amino acids 1-19 of SEQ ID NO: 2) from pNOV4349 while adding a 3′ Nco I and a 5′ BamH I site to the ends of the PCR product. The 80 bp PCR fragment was amplified from pNOV4349 and cut with BamH I and Nco I, then inserted into the 5226 bp BamH I/Nco I fragment from pNOV4328 (SEQ ID NO: 18; nucleotides 3621-5279 encode the α-galactosidase polypeptide). This plasmid was named pWIN013New. pWIN013New was then cut with Kpn I and Hind III, and the 2614 bp fragment was ligated to the 9154 bp Kpn I/Hind III fragment from binary vector pNOV2117 (SEQ ID NO: 15; see also PCT International Patent Application Publication WO 03/57248). This construct was named binary vector 12173. It encodes a fusion protein containing the 19 amino acid signal sequence from the Q protein (corresponds to the first 19 amino acids of SEQ ID NO: 2) fused to the α-galactosidase protein under the transcriptional control of a maize 27 kDa gamma zein promoter (SEQ ID NO: 13).

Transgenic maize plants were stably transformed with binary vector 12173 using Agrobacterium-mediated transformation. T1 seed were harvested from plants regenerated from maize tissue transformed with binary vector 12173. Seed were pulverized and soluble proteins were extracted from flour samples using extraction buffer (50 mM Tris-HCl, pH 8.0, 100 mM NaCl, 2 mM EDTA). Flour suspensions were incubated at room temperature for 60 minutes with agitation, and insoluble material was removed by centrifugation. The corn flour samples were then analyzed for α-galactosidase activity. α-galactosidase activity results are presented in Table 5.

TABLE 5
α-galactosidase Activity in Transgenic Maize

Sample ID             Construct   α-galactosidase units per g flour
MD9L020749            12173       70.848
MD9L020735            12173       68.175
MD9L030752            12173       52.274
MD9L020758            12173       43.160
MD9L019976            12173       23.976
MD9L019782            12173       11.877
MD9L019766            12173       2.566
Negative Corn Flour   N/A         −0.782

The α-galactosidase activity measured in flour from plants containing construct 12173, which includes the Q protein signal sequence, indicated an increased presence of the α-galactosidase protein. The levels ranged from 3 α-galactosidase units/gram of flour to 71 α-galactosidase units/gram of flour. The negative corn flour control had no α-galactosidase protein. These results are also presented in FIG. 4.
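The ranges quoted for Examples 4 and 5 can be reproduced from the tabulated values. The sketch below copies the transgenic sample values from Tables 4 and 5 and reports the count, minimum, maximum, and mean for each construct; the variable and function names are hypothetical, and no statistical comparison beyond these summaries is implied.

```python
from statistics import mean

# α-galactosidase units per gram of flour, copied from Tables 4 and 5
# (transgenic samples only; negative controls are excluded).
TABLE_4 = {  # construct 11045, full-length Q protein fusion
    "MD9L011883": 4.06, "MD9L011893": 9.31, "MD9L011901": 8.39,
    "MD9L011903": 9.28, "MD9L011912": 7.79,
}
TABLE_5 = {  # construct 12173, Q protein signal sequence fusion
    "MD9L020749": 70.848, "MD9L020735": 68.175, "MD9L030752": 52.274,
    "MD9L020758": 43.160, "MD9L019976": 23.976, "MD9L019782": 11.877,
    "MD9L019766": 2.566,
}


def summarize(activities):
    """Return count, minimum, maximum, and mean of the activity values."""
    values = list(activities.values())
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
    }


for label, table in (("11045", TABLE_4), ("12173", TABLE_5)):
    print(label, summarize(table))
```

Rounding the minimum and maximum to whole units gives the ranges stated in the text: 4 to 9 units per gram for construct 11045 and 3 to 71 units per gram for construct 12173.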

REFERENCES

The references listed below, as well as all references cited in the specification, are incorporated herein by reference in their entireties to the extent that they supplement, explain, provide a background for, or teach methodology, techniques, and/or compositions employed herein.

-   Allison et al. (1986) Virology 154:9-20.
-   Altschul et al. (1990) J Mol Biol 215:403-410.
-   Aoyama & Chua (1997) Plant J 11:605-612.
-   Ausubel et al. (2002) Short Protocols in Molecular Biology, Fifth ed. Wiley, New York, N.Y., United States of America.
-   Ausubel et al. (2003) Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, N.Y., United States of America.
-   Barany & Merrifield (1980) in Peptides: Analysis, Synthesis, Biology, Vol. 2, Special Methods in Peptide Synthesis, Part A (Gross E & Meienhofer J, eds) Academic Press, New York, N.Y., United States of America, pp. 3-284.
-   Bartlett et al. (1982) in Methods in Chloroplast Molecular Biology, (Edelman M, Hallick R B & Chua N-H, eds.) Elsevier Biomedical Press, New York, N.Y., United States of America, pp. 1081-1091.
-   Batzer et al. (1991) Nucl Acids Res 19:5081.
-   Bevan (1984) Nucl Acids Res 12:8711-21.
-   Bevan et al. (1983) Nature 304:184-187.
-   Binet et al. (1991) Plant Mol Biol 17:395-407.
-   Blochinger & Diggelmann (1984) Mol Cell Biol 4:2929-2931.
-   Boronat et al. (1986) Plant Sci 47:95-102.
-   Bourouis & Jarry (1983) EMBO J 2:1099-1104.
-   Caddick et al. (1998) Nat Biotechnol 16:177-180.
-   Callis et al. (1987) Genes Dev 1:1183-1200.
-   Callis et al. (1990) J Biol Chem 265:12486-12493.
-   Casas et al. (1993) Proc Natl Acad Sci USA 90:11212-6.
-   Chibbar et al. (1993) Plant Cell Rep 12:506-509.
-   Christensen & Quail (1989) Plant Mol Biol 12:619-632.
-   Christou et al. (1991) Bio/Technology 9:957-962.
-   Clark (ed.) (1997) Plant Molecular Biology: A Laboratory Manual, Chapter 7, Springer-Verlag GmbH & Co. KG, Berlin, Germany.
-   Comai et al. (1988) J Biol Chem 263:15104-15109.
-   Creighton (1984) Proteins, WH Freeman & Co., New York, N.Y., United States of America.
-   Datta et al. (1990) Bio/Technology 8:736-740.
-   de Framond (1991) FEBS Lett 290:103-6.
-   Della-Cioppa et al. (1987) Plant Physiol 84:965-968.
-   Deutscher (1990) Guide to Protein Purification (Simon & Abelson (eds)), Meth Enzymol Volume 182.
-   Dong et al. (1996) Mol Breeding 2:267-276.
-   Elroy-Stein et al. (1989) Proc Natl Acad Sci USA 86:6126-6130.
-   Engelen et al. (2001) J AOAC Intl 84:629-33.
-   EP 0 292 435; EP 0 332 104; EP 0 332 581; EP 0 342 926; EP 0 392 225.
-   Firek et al. (1993) Plant Mol Biol 22:129-142.
-   Fromm et al. (1990) Biotechnology (NY) 8:833-839.
-   Gallie et al. (1989) in Molecular Biology of RNA (Cech (ed.)), Alan R. Liss, Inc., New York, N.Y., United States of America, pp. 237-256.
-   Gallie et al. (1987) Nucl Acids Res 15:8693-8711.
-   Goff et al. (2002) Science 296:92-100.
-   Gordon-Kamm et al. (1990) Plant Cell 2:603-618.
-   Green (2000) Trends Biochem Sci 25:59-63.
-   Gritz & Davies (1983) Gene 25:179-188.
-   Henikoff & Henikoff (1992) Proc Natl Acad Sci USA 89:10915-10919.
-   Hiei et al. (1997) Plant Mol Biol 35:205-18.
-   Hiei et al. (1994) Plant J 6(2):271-282.
-   Höfgen & Willmitzer (1988) Nucl Acids Res 16:9877.
-   Hudspeth & Grula (1989) Plant Molec Biol 12:579-589.
-   Jobling & Gehrke (1987) Nature 325:622-625.
-   Karlin & Altschul (1993) Proc Natl Acad Sci USA 90:5873-5877.
-   Keegan et al. (1986) Science 231:699-704.
-   Kellogg (1998) Proc Natl Acad Sci USA 95:2005-2010.
-   Kempin et al. (1997) Nature 389:802-803.
-   Klein et al. (1987) Nature 327:70-73.
-   Koehler & Ho (1990) Plant Cell 2:769-783.
-   Kong & Steinbiss (1998) Arch Virol 143:1791-1799.
-   Koziel et al. (1993) Bio/Technology 11:194-200.
-   Kuchler (1997) Biochemical Methods in Cell Culture and Virology, Dowden, Hutchinson and Ross, Inc., Stroudsburg, Pa., United States of America.
-   Kyte & Doolittle (1982) J Mol Biol 157:105-132.
-   Lebel et al. (1998) Plant J 16:223-233.
-   Liebl et al. (1998) Syst Appl Microbiol 21:1-11.
-   Logemann et al. (1989) Plant Cell 1:151-158.
-   Lommel et al. (1991) Virology 81:382-385.
-   Macejak & Sarnow (1991) Nature 353:90-94.
-   Mayo (1987) The Theory of Plant Breeding, Second Edition, Clarendon Press, New York, N.Y., United States of America.
-   McBride et al. (1994) Proc Natl Acad Sci USA 91:7301-7305.
-   McBride & Summerfelt (1990) Plant Mol Biol 14:269-276.
-   McElroy et al. (1991) Mol Gen Genet 231:150-160.
-   McElroy et al. (1990) Plant Cell 2:163-71.
-   Merrifield (1963) J Am Chem Soc 85:2149-54.
-   Messing & Vieira (1982) Gene 19:259-268.
-   Mettler (1987) Plant Mol Biol Reporter 5:346-349.
-   Miao & Lam (1995) Plant J 7:359-365.
-   Mukumoto et al. (1993) Plant Mol Biol 23:995-1003.
-   Murashige & Skoog (1962) Physiologia Plantarum 15:473-497.
-   Needleman & Wunsch (1970) J Mol Biol 48:443-453.
-   Negrotto et al. (2000) Plant Cell Reports 19:798-803.
-   Norris et al. (1993) Plant Mol Biol 21:895-906.
-   Ohtsuka et al. (1985) J Biol Chem 260:2605-2608.
-   Paszkowski et al. (1988) EMBO J 7:4021-26.
-   Paszkowski et al. (1984) EMBO J 3:2717-2722.
-   Paterson (1996) in Genome Mapping in Plants, Chapter 2, (Paterson (ed.)), Academic Press/R.G. Landes Co., Austin, Tex., United States of America.
-   PCT International Patent Application Publications WO 93/07278; WO 93/21335; WO 94/00977; WO 97/32011; and WO 03/57248.
-   Pearson & Lipman (1988) Proc Natl Acad Sci USA 85:2444-2448.
-   Picard et al. (1988) Cell 54:1073-1080.
-   Potrykus et al. (1985) Mol Gen Genet 199:169-177.
-   Prat et al. (1985) Nucl Acids Res 13:1493-1504.
-   Reed et al. (2001) In Vitro Cell Dev Biol-Plant 37:127-132.
-   Reich et al. (1986) Bio/Technology 4:1001-1004.
-   Reiter et al. (1992) in Methods in Arabidopsis Research, World Scientific Press, River Edge, N.J., United States of America.
-   Rogers et al. (1985) Proc Natl Acad Sci USA 82:6512-6516.
-   Rohrmeier & Lehle (1993) Plant Mol Biol 22:783-792.
-   Rossolini et al. (1994) Mol Cell Probes 8:91-98.
-   Roth et al. (1991) Plant Cell 3:317-325.
-   Rothstein et al. (1987) Gene 53:153-161.
-   Sambrook & Russell (2001) Molecular Cloning: A Laboratory Manual, 3^(rd) ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., United States of America.
-   Schmidhauser & Helinski (1985) J Bacteriol 164:446-455.
-   Schocher et al. (1986) Bio/Technology 4:1093-1096.
-   Schultz et al. (1998) in Plant Molecular Biology Manual, 2nd edition (Gelvin et al. (eds.)) Kluwer Academic Publishers, New York, N.Y., United States of America.
-   Scopes (1982) Protein Purification: Principles and Practice, Springer-Verlag, New York, N.Y., United States of America.
-   Sherman et al. (1982) Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., United States of America.
-   Shimamoto et al. (1989) Nature 338:274-276.
-   Shinshi et al. (1990) Plant Mol Biol 14:357-368.
-   Silhavy et al. (1984) Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., United States of America.
-   Singh (1986) Breeding for Resistance to Diseases and Insect Pests, Springer-Verlag, New York, N.Y., United States of America.
-   Skuzeski et al. (1990) Plant Mol Biol 15:65-79.
-   Smith & Waterman (1981) Adv Appl Math 2:482-489.
-   Song et al. (2001) Genome Res 12:1549-1555.
-   Southern et al. (1991) J Gen Virol 72:1551-7.
-   Spencer et al. (1990) Theor Appl Genet 79:625-631.
-   Stewart & Young (1984) Solid Phase Peptide Synthesis, 2^(nd) ed. Pierce Chemical Co., Rockford, Ill., United States of America.
-   Svab et al. (1990) Proc Natl Acad Sci USA 87:8526-8530.
-   Svab & Maliga (1993) Proc Natl Acad Sci USA 90:913-917.
-   Taylor et al. (1993) Plant Cell Rep 12:491-495.
-   Thompson et al. (1987) EMBO J 6:2519-2523.
-   Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes. Elsevier, New York, N.Y., United States of America.
-   Torrent et al. (1997) Plant Mol Biol 34:139-149.
-   Triezenberg et al. (1988) Genes Dev 2:718-729.
-   Uknes et al. (1992) Plant Cell 4:645-656.
-   Unger et al. (1989) Plant Mol Biol 13:411-418.
-   U.S. Pat. Nos. 4,554,101; 4,940,935; 4,945,050; 5,036,006; 5,100,792; 5,188,642; 5,466,785; 5,501,967; 5,523,311; 5,591,616; 5,614,395; 5,639,949; 5,767,378; 5,994,629; and 6,369,298.
-   Van den Broeck et al. (1985) Nature 313:358-363.
-   Vasil et al. (1992) Bio/Technology 10:667-674.
-   Vasil et al. (1993) Bio/Technology 11:1553-1558.
-   Vitale et al. (1982) J Exp Bot 33:439-448.
-   Wallace et al. (1988) Science 240:662-664.
-   Warner et al. (1993) Plant J 3:191-201.
-   Wasmann et al. (1986) Mol Gen Genet 205:446-453.
-   Weeks et al. (1993) Plant Physiol 102:1077-1084.
-   Welsh (1981) Fundamentals of Plant Genetics and Breeding, John Wiley & Sons, New York, N.Y., United States of America.
-   White et al. (1990) Nucl Acids Res 18:1062.
-   Wood (ed.) (1983) Crop Breeding, American Society of Agronomy, Madison, Wis., United States of America.
-   Wricke & Weber (1986) Quantitative Genetics and Selection Plant Breeding, Walter de Gruyter and Co., Berlin, Germany.
-   Wyss et al. (1999) Appl Environ Microbiol 65:359-66.
-   Xu et al. (1993) Plant Mol Biol 22:573-588.
-   Zhang et al. (1988) Plant Cell Reports 7:379-384.
-   Zhao et al. (2000) Plant Mol Biol 44:789-98.
-   Zhu et al. (1999) Proc Natl Acad Sci USA 96:8768-8773.

It will be understood that various details of the presently disclosed subject matter can be changed without departing from the scope of the presently disclosed subject matter. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation.

1-44. (canceled)
45. A method of targeting a protein of interest to a structure of a plant cell selected from the group consisting of endoplasmic reticulum (ER) and apoplast, the method comprising: (a) fusing a nucleic acid molecule encoding a signal sequence of a Zea mays Q protein in frame to a nucleotide sequence encoding the protein of interest, wherein the nucleic acid molecule encoding a signal sequence of a Zea mays Q protein and the nucleotide sequence encoding the protein of interest are operably linked to a promoter to produce a plant expression construct; and (b) transforming the plant cell with the plant expression construct, whereby the protein of interest is targeted to the structure.
46. A method of producing a plant seed with an increased nutritional value, the method comprising: (a) transforming a plant cell with an expression vector comprising a nucleotide sequence encoding SEQ ID NO: 2, or a fragment or derivative thereof; (b) regenerating a plant from the transformed plant cell; and (c) isolating a seed from the regenerated plant, whereby a seed with an increased nutritional value is produced.
47. The method of claim 46, wherein the increased nutritional value is selected from the group consisting of an increased level of an essential amino acid, an improved amino acid balance, and an improved amino acid digestibility, when compared to a seed from a non-transformed plant of the same species.
48. A method of targeting a protein of interest to a protein body in a plant, the method comprising: (a) fusing a nucleic acid molecule encoding SEQ ID NO: 2, or a fragment or derivative thereof, in frame to a nucleotide sequence encoding the protein of interest, wherein the nucleic acid molecule encoding SEQ ID NO: 2, or the fragment or derivative thereof, and the nucleotide sequence encoding the protein of interest are operably linked to a promoter to produce a plant expression construct; and (b) transforming the plant cell with the plant expression construct, whereby the protein of interest is targeted to a protein body in the plant.
49. The method of claim 45, wherein the nucleic acid molecule encoding a signal sequence of a Zea mays Q protein encodes an amino acid sequence comprising amino acids 1-19 of SEQ ID NO: 2.
50. The method of claim 46, wherein the fragment of SEQ ID NO: 2 comprises amino acids 1-19 of SEQ ID NO: 2.
51. The method of claim 48, wherein the fragment of SEQ ID NO: 2 comprises amino acids 1-19 of SEQ ID NO: 2.